REMARKS

Claims 37-50 were pending in the subject application. By this 
Amendment, applicants have hereinabove amended claims 37, 42, 43, 
48-50 and added new claim 51. 

Applicants maintain that the amendments to the claims are fully- 
supported by the specification as originally filed and do not 
raise any issue of new matter. Accordingly, applicants 
respectfully request entry of this Amendment. 

Claim Objections

In section 6 of the August 29, 2008 Office Action, the Examiner 
objected to claims 43 and 48 for informalities. Specifically, 
the Examiner objected to claim 43 for lacking the recitation 
"plant" after "selected" and objected to claim 48 for not 
reciting the full form of "ERECTA."

In response, in order to expedite prosecution but without
conceding the correctness of the Examiner's position, applicants
have amended claim 43 to recite "...wherein the method further
comprises propagating the selected plant." Applicants have
amended claim 48 to recite "...transforming a culture of plant
cells with the full-form of the ERECTA gene..." Accordingly,
applicants respectfully maintain that claims 43 and 48 are not
objectionable and respectfully request that the Examiner
reconsider and withdraw this objection.

Claim Rejections under 35 U.S.C. §112, Indefiniteness

In section 7 of the August 29, 2008 Office Action, the Examiner 
rejected claims 37-50 under 35 U.S.C. §112, second paragraph, as 
allegedly indefinite for failing to particularly point out and 
distinctly claim the subject matter which applicant regards as 
the invention. 
Claims 37, 42, and 48

The Examiner alleged that claims 37, 42 and 48 are indefinite for 
the recitation "...transcribed to form a transcription product 
which is then expressed..." The Examiner asserted that the 
transcription product is itself an expression product stating 
that, "A transcription product is suppose to under translation to 
produce the protein." 

In response, applicants respectfully direct the Examiner's
attention to Lubert Stryer et al., Biochemistry, 5th edition
(2002), a copy of the relevant pages of which is attached as
Exhibit A, and which defines gene expression as "the combined
process of the transcription of a gene into mRNA, the processing
of that mRNA, and its translation into protein (for
protein-encoding genes)." However, to expedite prosecution but
without conceding the correctness of the Examiner's position,
applicants have herein amended claims 37, 42, and 48 to no longer
recite the phrase "which is then expressed."

Claim 48 

The Examiner alleged that claim 48 is indefinite for the
recitation "gene," which is asserted to be confusing because the
limitation "gene" implies a structure that comprises not only the
coding sequence but also the associated promoter, terminator and
enhancer encoding regions (see The Federal Register, Vol. 66,
No. 4, Friday, January 5, 2001 at page 1108, left column,
Endnote 13). The Examiner indicated that all subsequent
recitations of "gene" are also rejected.

In response, applicants respectfully traverse the Examiner's 
rejection. The Examiner asserted that in the instant case, 
applicants do not appear to describe such ERECTA gene associated 
nucleic acid sequences. Applicants respectfully point out that 
the sequence of the ERECTA gene was known and published prior to 
the filing date, for example as noted in the present 
specification at pages 33 and 34 in which alleles of the 
ERECTA gene in Arabidopsis are identified. Torii et al., 
(1996) for example provides the nucleotide and deduced amino 



Applicants: Josette Masle et al.
Serial No.: 10/519,135

Page 7 of 15: Amendment in Response to August 29, 2008 Office
Action

acid sequence of the ER gene (see for example Figure 5 and the
description of Figure 5 at page 740 of Torii et al.), and the
sequences are also available in GenBank/EMBL/DDBJ under accession
numbers U47029 and D83257 for the cDNA and genomic DNA sequence,
respectively, with the positions of introns and exons.
Accordingly, the gene sequence is part of the prior art.
Furthermore, the specification describes that an ERECTA gene may
be used (at pages 55 to 59 of the specification as filed,
paragraphs [0277]-[0296] of the specification as published), and
the presence of a promoter (pages 57-59) and a terminator is
clearly contemplated.

Applicants have added new claim 51 reciting all the limitations of
claim 48, with the exception that the culture of plant cells is
transformed with "...a nucleic acid encoding an ERECTA
protein..."

Claims 49 and 50

The Examiner alleged that the recitation of "seeds from a
selected plant" in claims 49 and 50 is confusing, asserting
that it is unclear whether the seeds comprise the nucleic acid.

In response, to expedite prosecution but without conceding the
correctness of the Examiner's position, applicants have herein
amended claims 49 and 50 to recite "...wherein said seeds
comprise the nucleic acid."

Accordingly, applicants respectfully request that the Examiner
reconsider and withdraw these indefiniteness rejections.
Rejections under 35 U.S.C. §112, Written Description

In section 8 of the August 29, 2008 Office Action, the Examiner 
rejected claim 48 under 35 U.S.C. §112, first paragraph, as 
allegedly not complying with the written description requirement. 
The Examiner alleged that the claim contains subject matter which 
was not described in the specification in such a way as to 
reasonably convey to one skilled in the relevant art that the 
inventors, at the time the application was filed, had possession 
of the claimed invention. The Examiner's detailed reasons are set
forth on pages 5-7 of the August 29, 2008 Office Action.



Applicants' Reply 

In response, applicants respectfully traverse the Examiner's 
rejection. The Examiner asserts that claim 48 is not 
sufficiently described, as the specification does not describe 
the structure of ERECTA genes isolated from diverse sources and 
genetic backgrounds. Applicants respectfully note that the
Examiner has not acknowledged that Examples 12, 14 and 15 set
out in detail methods which were used to identify orthologs of
Arabidopsis ERECTA in sorghum, wheat and maize, and that Example
13 does so for ERECTA homologs in Arabidopsis thaliana. These
homologs and orthologs are also set out in Figures 12 to 15 of
the present specification.

As discussed above, ERECTA was known prior to the filing date of
the present application. In studies relating to the
characteristics and phylogeny of Receptor-Like Kinases in
Arabidopsis, Shiu and Bleecker (2001), attached hereto as Exhibit
B, found that there were about 200 Leucine-Rich-Repeat
Receptor-Like Kinases (LRR-RLKs) which were easily recognisable by
simple sequence analysis for the presence of "repeats" in the
receptor domain. Within the LRR-RLKs, further division into
subfamilies was made on the basis of other structural/sequence
features, and Shiu and Bleecker provide the precise
characteristics necessary to identify respective LRR-RLKs.
ERECTA is a member of the LRR-RLK group, belonging to subclass
LRR XIII set out in Figure 2 at page 10766 of Shiu and Bleecker
(2001). Figure 4 of this paper identifies only 7 members of this
LRR XIII subclass in Arabidopsis.

Using methods described in the present specification, for 
instance as exemplified by the applicants in Examples 12 to 15, a 
person of skill in the art at the filing date would readily be 
able to identify and characterise ERECTAs from amongst the family 
of Leucine-Rich-Repeat Receptor-Like Kinases. 

The Examiner asserts that Shpak et al. (2004) suggests that 
ERECTA genes are involved in diverse cellular processes, and thus 
the present broad claim encompasses structures whose function is 
unrelated to the ERECTA polypeptide of SEQ ID NO: 2.

In response it is noted that the Examiner appears to have assumed
that a diversity of function of ERECTA is based on a diversity of
structure. This assumption is not supported by fact. For
example, the Examiner has asserted in paragraph 9 of the
Office Action that Japanese patent publication No. JP
09056382A ("Mitsukawa") discloses the use of a polypeptide of SEQ
ID NO: 2. Mitsukawa discloses that a polynucleotide encoding a
polypeptide of SEQ ID NO: 2 is associated with the control of
elongation of plant stems. In the present specification,
however, the polypeptide of SEQ ID NO: 2 is demonstrated to be
associated with the distinct and unrelated function of
transpirational efficiency.

It is respectfully submitted that the same ERECTA may therefore
be involved in a diversity of functions. This diversity of
function may, for example, be a result of the region of the plant
genome into which the ERECTA is introduced, the copy number of
the ERECTA which is expressed, and the degree to which the
plant's endogenous ERECTA is deficient or suboptimal. It may
also be related to the cell and tissue specificity of the ERECTA
protein expression. Thus the present specification reveals that
ERECTA regulates stomatal density (see for example Figure 2a),
and hence is involved in a pathway associated with stomatal
patterning, which is quite distinct from a pathway regulating
elongation of cells in the reproductive stems. The present
claims, but not the prior art, require the selection of plants
having enhanced transpirational efficiency. Suitable methods for
identifying the level of transpirational efficiency of a plant
are described in detail in the specification.

Accordingly, it is respectfully submitted that the specification 
fully describes claim 48 and applicants request that the Examiner 
reconsider and withdraw the rejection for lack of written 
description. 

Rejections under 35 U.S.C. §§102(b) and 103(a)

In section 9 of the August 29, 2008 Office Action, the Examiner
rejected claims 37-50 under 35 U.S.C. §102(b) as allegedly
anticipated by or, in the alternative, under 35 U.S.C. §103(a) as
allegedly obvious over Mitsukawa et al. (Japanese Patent
Publication No. JP 09056382 A, published March 4, 1997) as
evidenced by Masle et al. (Nature, 436:866-870, 2005).

Applicants' Response

Rejections under 35 U.S.C. §102(b)

In response, applicants respectfully traverse the Examiner's
rejection. The Examiner asserts that Mitsukawa discloses a
method of producing a transgenic plant which would inherently
possess the properties possessed by the presently claimed
invention.

The Examiner has misapplied the law of inherent anticipation,
and consequently has made an improper rejection under 35
U.S.C. §102.

As cited in M.P.E.P. §2112 with regard to inherent anticipation,
"[t]he fact that a certain result or characteristic may occur or
be present in the prior art is not sufficient to establish the
inherency of that result or characteristic. In re Rijckaert, 9
F.3d 1531, 1534, 28 USPQ2d 1955, 1957 (Fed. Cir. 1993)". More
specifically, "[t]o establish inherency, the extrinsic evidence
'must make clear that the missing descriptive matter is
necessarily present in the thing described in the reference, and
that it would be so recognized by persons of ordinary skill.
Inherency, however, may not be established by probabilities or
possibilities. The mere fact that a certain thing may result from
a given set of circumstances is not sufficient.' In re Robertson,
169 F.3d 743, 745, 49 USPQ2d 1949, 1950-51 (Fed. Cir. 1999)"
(M.P.E.P. §2112) (emphasis added). Accordingly, the Examiner's
unsupported statement that "such a property [enhanced
transpiration efficiency] would be inherent to the method of
expressing the protein (accession no. AAW13408) in Mitsukawa et
al. transgenic plant" is an insufficient basis for an anticipation
rejection based on inherency. The Examiner has failed to cite any
evidence that enhanced transpiration efficiency would
necessarily result.

For the Examiner's convenience, and as a substitute for the
computer translation of this citation which the Examiner has used
previously, applicants respectfully submit a verified English
translation of Mitsukawa, together with a copy of the complete
Japanese document which was translated, attached hereto as
Exhibits C, D and E respectively.

Applicants respectfully submit that one of ordinary skill in the
art at the time of filing would not necessarily arrive at a plant
with enhanced transpirational efficiency when following the
method provided by Mitsukawa.

As submitted in previous responses, Mitsukawa does not disclose 
the step of selecting a plant with enhanced transpirational 
efficiency as required by the claims. Instead, Mitsukawa 
discloses a different step of identifying a plant with 
enhanced stem length. Applicants submit that there is no 
correlation between stem length and transpirational efficiency 
in plants with a given ERECTA.

The applicants provide below evidence supporting the lack of 
correlation. 

Firstly, the applicants demonstrate that polymorphisms in ERECTA
sequence which lead to significant changes in morphogenesis (as
described in Mitsukawa) do not necessarily lead to changes in
transpiration efficiency. The graph which is attached hereto as
Exhibit F shows the relationship obtained by the applicants
between transpiration efficiency and stem length among a range of
Arabidopsis accessions which are polymorphic in ERECTA. The Y
axis plots the plant's 13C isotope discrimination, as a measure
of transpirational efficiency (with decreasing 13C discrimination
associated with increased transpirational efficiency either due
to more closed stomata or to increased photosynthetic capacity or
to both, as is the case for ERECTA). The X axis plots the height
of the mature plant as measured by the height of the
inflorescence. Individual accessions are plotted as
individual points. This graph shows that for any given plant
stem height, a range of transpirational efficiencies may be
obtained, with no correlation between stem height and 13C
discrimination. No statistically significant association between
13C discrimination and plant height was found over the whole
range. Thus, in the light of this data, it is submitted that for
any given ERECTA allele there is no correlation between the plant
height which is obtained and the transpirational efficiency of
that plant.

Secondly, the applicants demonstrate that while the degree of
overexpression of ERECTA tends to correlate with increased
transpirational efficiency, it does not correlate with plant
height. Two graphs are provided, attached hereto as Exhibit G.
These graphs plot the degree of ERECTA expression in ERECTA
overexpressing plant lines on the X axis (fold change in
expression compared to ERECTA expression in the Columbia allele,
normalised to a control gene) against either 13C
discrimination (as a measure of transpirational efficiency, top
graph) or plant height at maturity (bottom graph) on the
respective Y axes.

Although there is a trend for plants with increasing ERECTA 
expression to have enhanced transpirational efficiency beyond 
wild type levels, the bottom graph shows that apart from null 
expression lines there was no clear correlation between the level 
of ERECTA overexpression and plant height. Accordingly, the
selection of a plant with increased height does not necessarily 
result in the selection of a plant with enhanced transpirational 
efficiency. 

Accordingly it is respectfully submitted that the disclosure of 
a process for making plants with enhanced stem length in 
Mitsukawa does not inherently disclose the methods claimed in the 
present application.

Rejections under 35 U.S.C. §103(a)

Further, in the light of the above, it is respectfully submitted
that Mitsukawa also does not render the claims obvious. The
Examiner stated that, "It would have been obvious to one of
ordinary skill in the art to select for transgenic plant with
increased transpiration efficiency (inherently associated
property of polynucleotide sequence disclosed in the reference)
because selection of transgenic plant with a phenotype would have
been the ultimate useful goal without any surprises or unexpected
results." Applicants are unclear as to what is being referred to
as "phenotype." The Examiner does not segregate the different
phenotypes at issue here and simply blurs them into a generic
"phenotype," thereby ignoring what applicants are actually
claiming. As such, the rejection does not deal with the claimed
invention and is fatally defective. If stem length is the
phenotype one would aim to achieve, one skilled in the art would
not necessarily obtain the claimed invention, i.e. a plant
with enhanced transpirational efficiency. Mitsukawa teaches
the selection of plants with increased height, which, as set
out above, does not necessarily correlate with transpirational
efficiency. The Examiner has not identified any evidence which
teaches that ERECTA is associated with transpirational
efficiency. Applicants have, on the other hand, presented
evidence showing that transpirational efficiency does not
correlate with stem length.

If a telephone interview would be of assistance in advancing
prosecution of the subject application, applicants' undersigned
attorney invites the Examiner to telephone him at the number
provided below.






No fee, other than the enclosed $1,110.00 fee for a three-month
extension of time, is deemed necessary in connection with the
filing of this Amendment. However, if any additional fee is
required, authorization is hereby given to charge the amount of
any such fee to Deposit Account No. 03-3125.






Respectfully submitted, 




I hereby certify that this 
correspondence is being deposited on 
this date with the U.S. Postal 
Service with sufficient postage as 
first class mail in an envelope 
addressed to: 



John P. White
Registration No. 28,678
Gary J. Gershik
Registration No. 39,992
Attorneys for Applicants
Cooper & Dunham LLP (Customer #23432)
30 Rockefeller Plaza
20th Floor
New York, New York 10112
(212) 278-0400
31. The Control of Gene Expression

Bacteria are highly versatile and responsive organisms: the rate of synthesis 
of some proteins in bacteria may vary more than a 1000-fold in response to
the supply of nutrients or to environmental challenges. Cells of multicellular 
organisms also respond to varying conditions. Such cells exposed to 
hormones and growth factors will change substantially in shape, growth rate, 
and other characteristics. Moreover, many different cell types are present in 
multicellular organisms. For example, cells from muscle and nerve tissue 
show strikingly different morphologies and other properties, yet they contain 
exactly the same DNA. These diverse properties are the result of differences 
in gene expression. 

Gene expression is the combined process of the transcription of a gene into 
mRNA, the processing of that mRNA, and its translation into protein (for 
protein-encoding genes). A comparison of the gene-expression patterns of 
cells from the pancreas, which secretes digestive enzymes, and the liver, the 
site of lipid transport and energy transduction, reveals marked differences in 
the genes that are highly expressed ( Table 31.1 ), a difference consistent with 
the physiological roles of these tissues. 

How is gene expression controlled? Gene activity is controlled first and 
foremost at the level of transcription. Much of this control is achieved 
through the interplay between proteins that bind to specific DNA sequences 
and their DNA-binding sites. In this chapter, we shall see how signals from 
the environment of a cell can alter this interplay to induce changes in gene 
expression. We first consider gene-regulation mechanisms in prokaryotes and 
particularly in E. coli, because these processes have been extensively 
investigated in this organism. We then turn to eukaryotic gene regulation. In 
the chapter's final section, we explore mechanisms for regulating gene 
expression past the level of transcription.
Receptor-like kinases from Arabidopsis form a
monophyletic gene family related to animal 
receptor kinases 

Shin-Han Shiu and Anthony B. Bleecker* 

Department of Botany and Laboratory of Genetics, University of Wisconsin, Madison, WI 53706
Plant receptor-like kinases (RLKs) are proteins with a predicted
signal sequence, single transmembrane region, and cytoplasmic 
kinase domain. Receptor-like kinases belong to a large gene family 
with at least 610 members that represent nearly 2.5% of Arabi- 
dopsis protein coding genes. We have categorized members of this 
family into subfamilies based on both the identity of the extracel- 
lular domains and the phylogenetic relationships between the 
kinase domains of subfamily members. Surprisingly, this structur- 
ally defined group of genes is monophyletic with respect to kinase 
domains when compared with the other eukaryotic kinase families. 
In an extended analysis, animal receptor kinases, Raf kinases, plant 
RLKs, and animal receptor tyrosine kinases form a well supported 
group sharing a common origin within the superfamily of serine/ 
threonine/tyrosine kinases. Among animal kinase sequences, Drosophila
Pelle and related cytoplasmic kinases fall within the plant
RLK clade, which we now define as the RLK/Pelle family. A survey
of expressed sequence tag records for land plants reveals that 
mosses, ferns, conifers, and flowering plants have similar percent- 
ages of expressed sequence tags representing RLK/Pelle ho- 
mologs, suggesting that the size of this gene family may have been 
close to the present-day level before the diversification of land 
plant lineages. The distribution pattern of four RLK subfamilies on 
Arabidopsis chromosomes indicates that the expansion of this 
gene family is partly a consequence of duplication and reshuffling 
of the Arabidopsis genome and of the generation of tandem 
repeats. 

The ability to perceive and process information from chemical 
signals via cell surface receptors is a basic property of all 
living systems. In animals, the family of receptor tyrosine kinases 
(RTKs) mediates many signaling events at the cell surface (1, 2). 
This class of receptors is defined structurally by the presence of 
a ligand-binding extracellular domain, a single membrane- 
spanning domain, and a cytoplasmic tyrosine kinase domain. In 
plants, receptor-like kinases (RLKs) are a class of transmem- 
brane kinases similar in basic structure to the RTKs (3). In 
Arabidopsis alone, it has been reported that there are more than 
300 RLKs (4, 5). In the limited cases where a functional role has 
been identified for plant RLKs, they have been implicated in a 
diverse range of signaling processes, such as brassinosteroid 
signaling via BRI1 (6), meristem development controlled by 
CLV1 (7), perception of flagellin by FLS2 (8), control of leaf 
development by Crinkly4 (9), regulation of abscission by 
HAESA (10), self-incompatibility controlled by SRKs (11), and 
bacterial resistance mediated by Xa21 (12). Putative ligands for 
SRK (13, 14), CLV1 (15, 16), BRI1 (17), and FLS2 (18) have 
recently been identified. Proteins interacting with the kinase 
domains of RLKs in vitro have also been found (19-21). 

Plant RLKs can be distinguished from animal RTKs by the
finding that all RLKs examined to date show serine/threonine
kinase specificity, whereas animal receptor kinases, with the
exception of transforming growth factor-β (TGF-β) receptors, are
tyrosine kinases. In addition, the extracellular domains of RLKs are
distinct from most ligand-binding domains of RTKs identified so far
(1, 2). These differences raise the question of the specific
evolutionary relationship between the RTKs and RLKs within the
recognized superfamily of related eukaryotic serine/threonine/ 
tyrosine protein kinases (ePKs). An earlier phylogenetic analysis 
(22), using the six RLK sequences available at the time, indicated 
a close relationship between plant sequences and animal RTKs, 
although RLKs were placed in the "other kinase" category. A more 
recent analysis using only plant sequences led to the conclusion that 
the 18 RLKs sampled seemed to form a separate family among the 
various eukaryotic kinases (23). The recent completion of the 
Arabidopsis genome sequence (5) provides an opportunity for a 
more comprehensive analysis of the relationships between these 
classes of receptor kinases. 

To understand the evolution of the RLK family and its 
relationship with other kinase families and provide a framework 
to facilitate the prediction of RLK function, we set out to 
conduct a genome-wide survey of RLK-related sequences in 
Arabidopsis. Through a phylogenetic analysis of the conserved 
kinase domains, we sought to determine (i) whether RLKs 
belong to a monophyletic group when compared with other ePKs 
and (ii) how the RLKs are related to animal receptor kinases. To
investigate the relationship between the evolution of land plants 
and the expansion of the RLK family, we performed a survey of 
expressed sequence tags (ESTs) for a variety of organisms. 
Finally, we looked into the chromosomal distribution of four 
RLK subfamilies to investigate the potential mechanisms con- 
tributing to the expansion of this gene family in Arabidopsis. 

Materials and Methods 

Sequence Selection. RLKs. All published plant RLK sequences were 
retrieved, and their kinase domain sequences were used to conduct 
batch BLAST analysis (24) for related sequences in Viridiplantae, 
with an E value cutoff of 1 × 10^-10. The cutoff was chosen based
on multiple phylogenetic analyses using data sets generated from
cutoff E values of 1 × 10^-20, 1 × 10^-10, and 1. All known RLKs
were recovered at 1 × 10^-20; therefore, a more relaxed criterion,
1 × 10^-10, was used to retrieve all potentially related genes. The
search results were merged, and redundant sequences were deleted.
As of February 2001, more than 900 nonredundant candidates of
plant RLKs or related kinases were present in GenBank, and they
were used for subsequent phylogenetic analysis. For a complete list
of genes in the RLK/Pelle gene family, see supporting information,
which is published on the PNAS web site, www.pnas.org. The gene
name or accession numbers for RLKs shown in the manuscript are
as follows: ARK2 (AAB33486), At2g15300 (AAD26903),
At2g19130 (AAD12030), At2g24370 (AAD18110), At2g33580



This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: RLK, receptor-like kinase; RLCK, receptor-like cytoplasmic kinases; ePK,
eukaryotic protein kinase; RTK, receptor tyrosine kinase; RSK, receptor serine/threonine
kinase; APH(3')III, aminoglycoside kinase III; EST, expressed sequence tag; TGF-β,
transforming growth factor-β; LRR, leucine-rich repeat.
(AAB80675), At2g45340 (AAB82629), At4g11480 (CAB82153),
At4g26180 (CAA18124), At4g39110 (CAB43626), BRI1
(AAC49810), CLV1 (AAF26772), ERECTA (AAC49302),
HAESA (CAB79651), PR5K (AAC49208), PERK (AAD43169),
RPK1 (AAD11518), RKF1 (AAC50043), RKF3 (AAC50045),
RKL1 (AAC95351), TMK1 (JQ1674), WAK1 (CAA08794),
F1P2.130 (CAB61984), F4I18.11 (T02456), F13F21.28
(AAD43169), F15A17.170 (CAB86081), F17J16.160 (CAB86939),
F18L15.120 (CAB62031), F23E13.70 (CAA18124), F23M19.11
(AAD39611), F27K19.130 (CAB80791), MLD14.2 (BBA99679),
and T20L15.220 (CAB82765).
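
The retrieval step described under Sequence Selection (keep BLAST hits that pass an E value cutoff, merge the results from all queries, and delete redundant sequences) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `merge_hits`, the tuple layout, and the example E values are all hypothetical, and only the accession numbers are taken from the list above.

```python
from typing import Iterable, List, Tuple

# Hypothetical hit record: (query name, subject accession, E value).
Hit = Tuple[str, str, float]

def merge_hits(hits: Iterable[Hit], cutoff: float = 1e-10) -> List[str]:
    """Keep subjects whose E value is at or below the cutoff, dropping duplicates."""
    seen = set()
    kept = []
    for _query, subject, evalue in hits:
        if evalue <= cutoff and subject not in seen:
            seen.add(subject)
            kept.append(subject)
    return kept

# Made-up E values, for illustration only.
example_hits = [
    ("CLV1", "AAC49302", 1e-45),
    ("BRI1", "AAC49302", 1e-38),  # same subject recovered by a second query
    ("CLV1", "AAD43169", 1e-12),
    ("CLV1", "CAA08794", 1e-3),   # fails the 1e-10 cutoff
]
print(merge_hits(example_hits))
```

Tightening `cutoff` to `1e-20` in this toy data set keeps only the first accession, which mirrors the trade-off the text describes between a strict and a more relaxed criterion.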

Representatives of the eukaryotic protein kinase (ePK) superfamily.
Based on Hanks and Hunter (22), plant and animal sequences
from each ePK family were chosen. Plant kinases that seemed to be
unique to plants were also included in this study (23). Their
accession numbers are as follows. Arabidopsis sequences: CDC2a
(AAB23643), CPK7 (AAB03247), CKU (CAA55395), CKA1
(BAA01090), AME2 (BAA08215), MKK3 (BAA28829), MEKK1
(BAA09057), NAK (AAA18853), NPH1 (AAC01753), PVPK-like
PK5 (BAA01715), CTR1 (AAA32779), MRK1 (BAA22079),
S6K-like PK1 (AAA21142), GSK3β (CAA64408), GSK3t
(CAA68027), SnRK2-like PROKINa (AAA32845), and Tousled
(AAA32874); human sequences: CaMK1 (NP003647), CDK3
(NP001249), CK1a1 (NP001883), CK2a (CAB65624), GRK6
(P43250), RK (Q15835), Hunk (NP055401), CLK1 (P49759),
MAPK10 (P53779), MAPKK1 (Q02750), MAPKKK1 (Q13233),
cAPK (P17612), Raf1 (TVHUF6), c-SRC (P12931), TLK1
(NP036422), and TTK (A42861).

Animal receptor kinases. One representative human receptor
tyrosine kinase sequence was selected from each RTK subfamily
(1, 2) as follows: AXL (NP001690), DDR (Q08345), EGFR
(P00533), EPH (P21709), FGFR2 (P21802), HGFR (P08581),
IR (NP000199), KLG-like PTK7 (AAC50484), LTK (P29376),
MuSK (AAB63044), PDGFRβ (PFHUGB), RET (S05582),
RYK (137560), TIE (P35590), TRKa (BAA34355), and VEGFR
(P17948). Human TGF-β receptors (TGFβR I, P36897; TGFβR
II, P37173) were chosen as animal representatives of receptor
serine/threonine kinases.

Sequence Annotation, Alignment, and Phylogenetic Analysis. Delin- 
eation of structural domains. Structural domains of all sequences 
were annotated according to SMART (25) and Pfam (26) 
databases. The receptor-like kinase configuration was deter- 
mined by the presence of putative signal sequences and extra- 
cellular domains. Sequences without signal sequences, trans- 
membrane regions, or putative extracellular domains were also 
included in the analysis. The kinase domain sequences delin- 
eated initially according to sequence prediction databases were 
modified to include missing or exclude excessive flanking se- 
quences according to the subdomain signature of eukaryotic 
kinases (22). 

Alignment of sequences. The sizes of the kinase domains range 
from 250 to 300 aa. These sequences were compiled and aligned 
by using CLUSTALX (27). The weighing matrices used were
BLOSUM62 or PAM250 with the penalty of gap opening 10 and
gap extension 0.2. The alignments generated by these two scoring
tables are similar to each other and were manually adjusted 
according to the subdomain signatures of eukaryotic kinases 
(22). The alignment for all 610 RLK family members is provided 
as supporting information. 

Optimality criterion and PAUP program parameters. The aligned
sequences were analyzed with PAUP (29) based on the
Neighbor-Joining method (28), minimal evolution, and maximum
parsimony criteria. To obtain the optimal trees, bootstrap analyses
were conducted with 100 replicates using the heuristic search
option. Two character-weighing schemes used were (i) all
characters of equal weight and (ii) consider the number of nucleotide
changes required to change from one amino acid to the other. All



Table 1. The proportion of EST records representing RLK/Pelle
homologs in various organisms

Organism                     Total EST*   RLK homologs   %EST
Porphyra yezoensis               10,185              0   0
C. elegans                      109,095              0   0
D. melanogaster                  95,211              3   0.003
Chlamydomonas reinhardtii        55,860              0   0
Marchantia polymorpha             1,307              1   0.077
Mosses                            9,159             19   0.207
Ceratopteris richardii            2,838              7   0.247
Pinus taeda                      21,797            100   0.459
Arabidopsis thaliana            112,467            620   0.551
Glycine max                     122,843            704   0.573
Lotus japonicus                  26,844            135   0.503
Lycopersicon esculentum          87,680            526   0.6
Oryza sativa                     62,390            185   0.297
Triticum aestivum                44,132            178   0.403
Zea mays                         73,965            135   0.183

*The searches were conducted based on EST available from GenBank as of Dec.
15, 2000.



other parameters for PAUP were the default values. Because of
the difficulty in aligning kinase subdomain X, two character sets 
were defined with or without kinase subdomain X sequences. 

Tree rooting and display. Aminoglycoside kinase (APH(3')III) 
from Staphylococcus (P00554) (30) and the Arabidopsis homolog 
of RIO1 family kinases (S61006) (31) were used as outgroups in
this study. In all analyses, the rooting based on either sequence 
gave the same results. The numbers associated with each branch 
represent the bootstrap support, and branches with less than 
50% support are collapsed. 

Identification of Sequences Representing RLK Homologs. Genomic 
sequences. The kinase domain protein sequences of CLV1 and 
NAK were used to conduct BLAST searches against the genome
sequences of Saccharomyces cerevisiae, Caenorhabditis elegans,
Drosophila melanogaster, and human. The genomic sequence hits
with an E value smaller than 1 × 10^-10 were included for further
analysis. Phylogenetic trees were constructed with the candidate 
sequences and the eukaryotic protein kinase representatives. 
Sequences that fell into the same clade as RLKs and had more 
than 50% bootstrap support were regarded as RLK homologs. 
The sequences shown in this analysis are Caenorhabditis Pelle- 
like sequence (CePelle, T23534), Drosophila Pelle (DmPelle, 
Q05652), and human IRAK1 (NP001560).

EST sequences. CLV1 and DmPelle kinase domain sequences
were used to conduct BLAST searches against the EST records of
organisms listed in Table 1. All EST sequences with E values
smaller than 1.0 were retrieved for further analysis. The
sequences with E values smaller than 1 × 10^-50 were regarded as
RLK homologs. The rest of the sequences longer than 300
nucleotides were submitted for batch BLASTX searches against
Arabidopsis polypeptide records in GenBank. These sequences
were regarded as RLK homologs if the top five matches of the
BLAST outputs were RLK family kinases.
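
The two-tier rule just described can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the procedure actually run for the study: the `EstHit` record, its field names, and the labels returned are hypothetical, and the handling of sequences of 300 nucleotides or fewer (reported here as "not analyzed") is an assumption, since the text says only that the longer sequences were submitted for the second search.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EstHit:
    """Hypothetical record for one EST returned by the first search."""
    est_id: str
    e_value: float            # E value against the CLV1 or DmPelle kinase domain
    length_nt: int            # EST length in nucleotides
    # For each second-search match against Arabidopsis polypeptides, best first:
    # True if the match is an RLK family kinase.
    top_matches_are_rlk: List[bool] = field(default_factory=list)


def classify_est(hit: EstHit) -> str:
    """Apply the thresholds stated in the text to a single EST."""
    if hit.e_value >= 1.0:
        return "not retrieved"        # only E values smaller than 1.0 were kept
    if hit.e_value < 1e-50:
        return "RLK homolog"          # accepted on the first search alone
    if hit.length_nt <= 300:
        return "not analyzed"         # assumption: short ESTs were set aside
    top_five = hit.top_matches_are_rlk[:5]
    if len(top_five) == 5 and all(top_five):
        return "RLK homolog"          # all top five matches are RLK kinases
    return "not RLK homolog"
```

An EST with an E value of 1e-60 is accepted outright, whereas one with an E value of 1e-5 is accepted only if it is longer than 300 nucleotides and all five of its best second-search matches are RLK family kinases.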

Results 

The Diversity of RLKs in the Arabidopsis Genome. As the Arabidopsis 
genome sequencing effort approached completion, we con- 
ducted a genome-wide survey of the RLK gene family to gain 
more understanding of its size and complexity. The kinase 
domains of 22 different plant RLKs with various extracellular 
domains were used to search for similar sequences in GenBank 
polypeptide records of Viridiplantae, including all land plants 

[Fig. 1 graphic omitted. Column headings: Name; Structural Organization (extracellular domain, TM, kinase domain); Subfamilies with Similar Organization.]

Fig. 1. Domain organization of representative RLKs and RLK-subfamily
affiliations. Based on the presence or absence of extracellular domains,
members of this gene family are categorized as RLKs or RLCKs. The gray line
indicates the position of the membrane-spanning domain. The signal peptides
are presumably absent in mature proteins but are displayed to demonstrate
their presence in the RLKs. Locus names or MAtDB gene names are provided
for the RLK representatives. Domain names are given according to SMART and
Pfam databases (25, 26). Subfamilies are assigned based on kinase phylogeny
(see supporting information for subfamily assignments for all members of the
Arabidopsis RLK/Pelle family) and are shown according to the domain
organization of the majority of members in a given subfamily. Subfamilies with
>30% of members in more than one major extracellular domain category are
designated with an asterisk. DUF, domain of unknown function; EGF, epidermal
growth factor; C lectin, C-type lectin; L-lectin, legume lectin; PAN,
plasminogen/apple/nematode protein domain; TM, transmembrane region; TNFR,
tumor necrosis factor receptor.



and algae. With the cutoff E value of 1 × 10^-10, more than 900
nonredundant sequences were retrieved. The most recent survey 
of the completed Arabidopsis genome revealed 620 sequences 
related to RLKs. Ten of these sequences showed greatest 
sequence similarity to the Raf kinase family. For the remaining 
610 Arabidopsis sequences, 193 did not have an obvious receptor 
configuration as determined by the absence of putative signal 
sequences and/or transmembrane regions (see supporting in- 
formation). The other 417 genes with receptor configurations 
can be classified into more than 21 structural classes by their 
extracellular domains with examples shown in Fig. 1. The sizes 
of these classes varied greatly. The leucine-rich repeat (LRR)
containing RLKs represented the largest group in Arabidopsis 
with 216 genes. 

To determine whether RLKs with similar extracellular do- 
mains also have similar kinase domains, the polypeptide se- 
quences of the kinase domains of all 620 Arabidopsis genes were 
aligned, and a phylogenetic tree was generated with the
Neighbor-Joining method (28) using APH(3')III as outgroup (see



supporting information for the complete alignment). 
APH(3')III is a bacterial gene that is thought to be a distant
relative of ePK (30). The phylogeny of Arabidopsis kinase 
domain sequences revealed an interesting pattern where the 
sequences clearly fell into distinct clades (see supporting infor- 
mation for the phylogenetic tree). We have tentatively assigned 
these natural groups into 44 different RLK subfamilies based on 
the kinase domain phylogeny (see supporting information for the 
subfamily assignment). A noteworthy feature of the pattern 
obtained is that the members within each of the RLK subfamilies 
tend to have similar extracellular domains, indicating that a 
single domain-shuffling event may have led to the founding of 
each of the various RLK subfamilies. For example, the diverse 
LRR-containing RLKs fell into more than 13 subfamilies based 
on kinase-domain phylogeny. With few exceptions, the pattern 
obtained is consistent with the grouping based on the structural 
arrangement of LRRs and the organization of introns in the 
extracellular domains of the individual RLKs (data not shown). 
Phylogenetic trees were also generated using minimum evolution 
and maximum parsimony criteria. The results were similar to the
phylogeny generated with the Neighbor-Joining method (data
not shown). 

The Relationship Between RLKs and Other Families of Protein Kinases 
from Arabidopsis. Despite the similar domain organization be- 
tween different plant RLKs, the phylogenetic relationships 
among members of this family have not been thoroughly studied. 
Members of the RLK family could have arisen independently 
multiple times from distinct families of ePKs. Alternatively, they 
could have originated from a single ePK family and have a 
monophyletic origin. To address this question, we conducted a 
phylogenetic analysis by using the kinase domain amino acid 
sequences of representative RLK sequences from each RLK 
subfamily and representatives from different ePK families found 
in Arabidopsis. 

In the phylogeny based on minimal evolution criterion, all 
RLK representative sequences from Arabidopsis formed a well 
supported clade, indicating that RLKs have a monophyletic 
origin within the superfamily of plant kinases (Fig. 2). In addition 
to RLK sequences, this monophyletic group also included ki- 
nases with no apparent signal sequence or transmembrane 
domain, and they were collectively named receptor-like cyto- 
plasmic kinases (RLCKs, Fig. 1). Some of these kinases formed 
subfamilies distinct from other RLKs, whereas others were 
embedded within several different RLK subfamilies. To deter- 
mine whether the monophyletic grouping of the RLK family 
represented a bias because of the exclusive use of Arabidopsis 
sequences, an extended analysis was conducted using RLK 
sequences from plants other than Arabidopsis. The sequences 
analyzed all fell into the same clade as Arabidopsis RLKs (data 
not shown). 

Among the ePK families found in Arabidopsis, Raf kinases 
were paraphyletic to the RLK family and, together with RLKs, 
formed a well supported group with a bootstrap value of 98% 
(Fig. 2). Based on the parsimony criterion, the support for the 
RLK family and Raf kinases as a monophyletic group was still 
high at 86% (data not shown). Taken together, these results 
indicated that Raf kinases are the closest relatives to RLKs 
among the Arabidopsis sequences analyzed. 

The Relationships Between Animal Receptor Kinases and Plant RLKs. 

Animal RTKs and receptor serine/threonine kinases (RSKs) are 
other families of ePKs with a domain organization similar to that 
of the plant RLKs. To determine the relationships among these 
receptor kinase families, we analyzed the phylogenetic relation- 
ships between the kinase domain sequences of representative 
Arabidopsis RLKs and animal receptor kinases. Arabidopsis and 
human representatives of other ePK families were also included. 



Shiu and Bleecker 



PNAS | September 11, 2001 | vol. 98 | no. 19 | 10765



[Fig. 2 graphic omitted: phylogenetic tree of representative Arabidopsis ePK and RLK kinase domain sequences, with RLK subfamily labels.]


Fig. 2. Arabidopsis receptor-like kinases and related kinases form a mono- 
phyletic group distinct from all of the other eukaryotic protein kinases found 
in the Arabidopsis genome. The tree was generated with the kinase domain 
sequences of representative Arabidopsis ePKs and RLKs using APH(3')III as
outgroup based on minimal evolution. The bootstrap values are shown at the
nodes. The boxed region represents the receptor-like kinase family. The
arrowhead indicates the RLK subfamily. The abbreviations used are according
to Fig. 1.



The phylogenetic tree generated based on minimal evolution
criterion is shown in Fig. 3. All 16 RTK subfamily representatives 
and c-SRC formed a well supported group, indicating a mono- 
phyletic origin for tyrosine receptor kinases. The sister groups to 
the RTK family were Raf kinases. Plant RLKs included in this 
analysis formed another monophyletic group, indicating that 
RLKs have a distinct origin from that of Raf kinases and animal 
RTKs. Plant RLKs, Raf kinases, RSKs, and RTKs collectively 
formed a well supported group with a bootstrap support of 84%. 
The monophyly of kinases in this group when compared with the 
other ePK families was also supported by analyses based on 
maximum parsimony (data not shown). However, the specific 
relationships between animal RSKs, RTKs, Raf kinases, and 
plant RLKs were less conclusive because different optimality 
criteria gave inconsistent results. To investigate whether the 
results obtained were biased by using only human sequences, we 
conducted an extended analysis including RTK and RSK se- 
quences from Caenorhabditis, Drosophila, sponge, and hydra, 
and we reached the same conclusion (data not shown). Based on 
these analyses, we defined the monophyletic group that contains 
the RLK, RTK, RSK, and Raf kinase genes as the receptor 
kinase group. 

Homologs of Plant RLKs in Eukaryotes. To determine whether 
members of the RLK family are present in organisms other than 
flowering plants, we first used the kinase domain sequence of 

Fig. 3. Human receptor kinases and Arabidopsis receptor kinases belong to
distinct but related families, and Pelle kinases are the animal homologs of
Arabidopsis RLKs. (A) Plant and animal representatives of ePKs were used in
this analysis. The tree is rooted with APH(3')III based on minimal evolution. It
indicates that Raf, RSK, RTK, and RLK form a well supported group distinct 
from all other ePKs (boxed region). The bootstrap values are shown at the 
nodes. Animal Pelle kinases (shaded area) are found in the same clade as RLKs. 
(B) The proposed evolutionary relationships between receptor kinase family 
members are as follows: 1, an ancient duplication event leading to the 
divergence of RLK/Pelle from RTK/Raf; 2, a more recent gene duplication 
leading to the divergence of RTK from Raf; and 3, the divergence of plant and 
animal lineages, resulting in the ancestral sequences that gave rise to the 
extant receptors and related kinases. 



CLV1 to search for homologous sequences in the genomes of
yeast, C. elegans, D. melanogaster, and human. No RLK homolog
was found in the yeast genome. Five animal homologs of the
RLK family were found: the Pelle kinase (DmPelle) in Drosophila
(40), the Pelle-like kinase (CePelle) in Caenorhabditis
(T23534), and three IRAKs in human (32). DmPelle, CePelle,
and IRAK1 are all cytoplasmic kinases and all fell into the same
clade as plant RLKs with strong bootstrap support (Fig. 3,
shaded area). A similar search using other RLK kinase sequences
yielded the same results (data not shown). Based on this
analysis, we defined the clade containing the plant RLKs and
Pelle-like sequences as the RLK/Pelle family.

To broaden the scope of the searches, we used the amino acid 
sequences of CLV1 and Pelle kinase domains to search the EST 
database for RLK homologs in 20 different eukaryotes (15 
shown in Table 1). Sequences with an E value of less than
1 × 10^-50, a conservative criterion, were regarded as RLK homologs
without further examination. The remaining sequences were
subjected to BLAST searches and were treated as RLK homologs
if the top five matches were known RLKs or RLK homologs. The
results of EST searches are shown in Table 1. All seven of the 
flowering plants, including four dicots and three monocots, have 
0.18% to 0.6% of their ESTs representing RLK/Pelle family 
members. Pines, ferns, and mosses all have a percent EST 
representation similar to that of flowering plants. With the 
exception of the three ESTs representing Drosophila Pelle
kinase, no other organism examined produced ESTs that
could be classified as RLK/Pelle family members.
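
As a quick check on the range quoted for flowering plants, the %EST column can be recomputed from the counts in Table 1. The sketch below is illustrative only: the variable and function names are hypothetical, and the counts are simply copied from the table (reading the thousands separators as commas).

```python
# (organism, total ESTs, ESTs scored as RLK/Pelle homologs), copied from Table 1
FLOWERING_PLANTS = [
    ("Arabidopsis thaliana", 112_467, 620),
    ("Glycine max", 122_843, 704),
    ("Lotus japonicus", 26_844, 135),
    ("Lycopersicon esculentum", 87_680, 526),
    ("Oryza sativa", 62_390, 185),
    ("Triticum aestivum", 44_132, 178),
    ("Zea mays", 73_965, 135),
]


def percent_est(total_est: int, rlk_homologs: int) -> float:
    """Percentage of EST records that represent RLK/Pelle homologs."""
    return 100.0 * rlk_homologs / total_est


# Rounded to three decimals, as in the %EST column of Table 1.
percentages = {
    name: round(percent_est(total, hits), 3)
    for name, total, hits in FLOWERING_PLANTS
}

lowest = min(percentages.values())    # Zea mays
highest = max(percentages.values())   # Lycopersicon esculentum
```

The lowest and highest values come from Zea mays and Lycopersicon esculentum, which bracket the 0.18% to 0.6% range given in the text.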

The Distribution of RLKs on Arabidopsis Chromosomes. The size
discrepancy of the RLK/Pelle family between plants and animals
raises the question of how the expansion of this family
occurred in the plant lineages. To address this question, we
examined the location of RLKs on the Arabidopsis chromosomes.
After comparing the location of genes to the phylogeny
based on kinase domains, we found that subfamilies differed in
their chromosomal distributions. At one extreme, 35 of the 40
members of the DUF26 subfamily were located on chromosome
4 (Fig. 4A). At the other extreme, 51 genes representing LRR X,
XI, and XIII subfamilies were distributed among all five
chromosomes (Fig. 4B). In addition, we found that more than 30%
of the RLK/Pelle family members in Arabidopsis are in tandem
repeats with 2 to 19 genes. A closer look at the location of the
38 DUF26 subfamily members on chromosome 4 (including
three potential pseudogenes) indicates that 34 of them are in
tandem repeats (Fig. 4C). The phylogenetic relationships
between DUF26 genes in the tandem repeats indicate that at least
one intrachromosomal duplication event occurred in the region
containing tandem repeats. Taken together, the results suggest
that tandem duplication events and large-scale duplications of
chromosomes are two of the potential mechanisms responsible
for the expansion of the RLK/Pelle family in Arabidopsis.
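
One simple way to score tandem repeats from gene positions is sketched below. The text does not state the criterion that was used, so the definition here is an assumption made for illustration: family members count as a tandem array when they occupy consecutive positions in the gene order of one chromosome. The function name and the input format are likewise hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def tandem_arrays(members: Iterable[Tuple[int, int]]) -> List[Tuple[int, List[int]]]:
    """Group family members into runs of adjacent genes.

    members: (chromosome, gene_rank) pairs, where gene_rank is the ordinal
    position of the gene along its chromosome (assumed input format).
    Returns (chromosome, [gene_rank, ...]) for every run of two or more genes.
    """
    members = list(members)
    arrays = []
    for chrom in sorted({c for c, _ in members}):
        ranks = sorted({r for c, r in members if c == chrom})
        # Consecutive ranks share the same value of (rank - index in the list).
        for _, group in groupby(enumerate(ranks), key=lambda pair: pair[1] - pair[0]):
            run = [rank for _, rank in group]
            if len(run) >= 2:
                arrays.append((chrom, run))
    return arrays


# Toy positions, not real Arabidopsis data.
example = [(4, 10), (4, 11), (4, 12), (4, 20), (1, 5), (2, 7), (2, 8)]
found = tandem_arrays(example)
```

With the toy input, the three adjacent genes on chromosome 4 and the pair on chromosome 2 are reported as arrays, while the isolated genes are not.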

Discussion 

Evolutionary History of the Receptor Kinase Group. Plant RLKs were
originally grouped into a single family based on their configu- 
ration as transmembrane kinases with serine and threonine 
specificity. Our analysis provides a phylogenetic basis for the 
classification of RLKs as a single family in the eukaryotic protein 
kinase superfamily. Interestingly, 24% of the 610 Arabidopsis 
genes in the RLK/Pelle family analyzed do not have an extra- 
cellular domain based on the absence of signal sequences and 
transmembrane regions. Some of these apparently cytoplasmic 
kinases form unique subfamilies, whereas others are most closely 
related to kinases with a receptor topology. The latter may 
represent ancestral forms that were recruited into the receptor 
kinase configuration by domain fusion events. Alternatively, 
some of the soluble kinase forms could be derived from ancestral 
receptor kinase forms. In any case, it is apparent that kinase 
domains from the RLK/Pelle family were recruited multiple 
times by fusion with different extracellular domains to form the 
subfamilies found in Arabidopsis. This notion can be expanded to 
include the animal RTK and RSK families in the receptor kinase 
group, which appear to have been formed by recruitment of 
kinases from the same lineage, distinct from all other ePK 
families. 

Based on the kinase domain phylogeny, a hypothetical se- 
quence of events that occurred in the evolution of the receptor 

Fig. 4. Distribution of RLKs on Arabidopsis chromosomes provides clues for the
mechanisms of RLK family expansion. (A) The cladogram of the DUF26 subfamily 
was generated with the kinase domain sequences based on minimal evolution 
criterion. The color coding on branches indicates the chromosome on which each 
gene in the subfamily is located. Note that most DUF26 members are located on 
chromosome 4. (B) The cladogram of LRR X, XI, and XIII subfamilies was generated 
and color-coded in the same manner as A. Note that most genes derived from 
duplication events are located on different chromosomes. (C) A detailed
depiction of DUF26 distribution on chromosome 4 indicates that tandem duplications
and an internal chromosomal duplication may contribute extensively to the 
expansion of this subfamily. The 10-kb legend is for the expanded region showing 
tandem repeats. The regions with postulated chromosomal duplications are 
color-coded according to their similarity to regions on the other chromosomes. 
The color-coding scheme is the same as A. Three potential DUF26 pseudogenes 
are also included in the diagram. 



kinase group is proposed in Fig. 3B. According to this model, an 
early gene duplication event led to the founding of two lineages 
that diversified into the RTK and Raf families on one hand and 
the RLK/Pelle family on the other. This diversification seems to 
have occurred before the divergence of plants and animals. In 
addition, both lineages contain representatives of soluble kinase 
and transmembrane receptor forms. It should be noted that the 
soluble Pelle-like and Raf kinases form complexes with cell 
surface receptors and are responsible for transduction of signals 
to downstream effectors (33, 34). Perhaps the continual recruit- 
ment of this particular lineage of kinase modules was favored 
during evolution because ancestral forms had already specialized 
in mediating signaling from transmembrane receptors. Exami- 
nation of kinases belonging to the receptor kinase group in more 
primitive eukaryotes may be informative. Whereas fungi such as 
yeast and Neurospora do not appear to have representatives of 
the receptor kinase group, the slime mold, Dictyostelium
discoideum, has several examples (data not shown). None of these
sequences from slime mold has predicted signal peptide or 
transmembrane regions, and most of the sequences are dual 
specificity kinases based on their kinase activities (35), consistent 
with the possibility that the ancestral form for extant receptor 
kinases may have been soluble kinases. 

Diversification of the Plant RLK/Pelle Family. The small number of
representatives of the RLK/Pelle family in animals compared with 
the much larger number in Arabidopsis indicates that the expansion 
of the plant RLKs occurred after the divergence of plant and 
animal lineages or that massive gene loss occurred in the animals. 
A comparison of EST representation with the known total number 
of RLK/Pelle members in the fully sequenced genomes of C.
elegans, D. melanogaster, and Arabidopsis indicated that the EST
representation provided a conservative estimate of the total number
of family members in the genomes. The lack of RLK/Pelle
ESTs in Porphyra and Chlamydomonas argues that, rather than
massive gene loss in the animal genomes examined, the RLK/Pelle 
family likely underwent expansion after the divergence of animal 
and plant lineages. Interestingly, all land plants have similar percent 
representations of RLK/Pelle kinases, suggesting that the size of 
this gene family may have been similar to the present-day level 
before the diversification of the land plant lineages. Additional 
sequence information will be necessary to determine whether all 
RLK subfamilies found in Arabidopsis are equally represented in 
these other land plant lineages. The early expansion of the RLK/ 
Pelle family could be associated with evolution of multicellularity, 
as has been suggested for the RTK family in animals (36). Alter- 
natively, the expansion of the family could be associated with the 
development of the complex array of attributes required for the 
migration of plant lineages from the aquatic to the terrestrial 
environment. Examination of RLK/Pelle representation in multi- 
cellular green algae such as Chara could help to resolve this 
question. 

The monophyletic origin of the RLK/Pelle family implies that 
the expansion of the family to its present size in Arabidopsis was 
the result of multiple gene-duplication events. Two possible 
mechanisms for the amplification of this family are suggested by 
the way members of some subfamilies are distributed on the 
Arabidopsis chromosomes. For example, the DUF26 subfamily is 
organized in tandem arrays (Fig. 4C). These tandem arrays were 
likely generated by gene duplications resulting from unequal 
crossing-over as seen in the other gene families such as disease 
resistance genes (37). Gene duplication is also driven by
larger-scale duplication events, including polyploidization followed by
reshuffling of chromosomal regions (5, 38, 39). Tandem arrays 
of DUF26 members are located in such duplicated regions on 
chromosome 4. However, the localization of other DUF26 
subfamily members almost exclusively on chromosome 4 suggests 
that this subfamily expanded after the extensive chromosome 
duplications and reshuffling identified for multiple regions of all 
five Arabidopsis chromosomes (5, 38, 39). 

On the other hand, members of LRR X, XI, and XIII 
subfamilies are distributed among all five chromosomes, with 
related genes on each branch of the phylogenetic tree generally 
located on different chromosomes. These three related subfam- 
ilies are of particular interest because they include Arabidopsis 
RLKs with known developmental functions such as BRI1, 
CLV1, ERECTA, and HAESA (6, 7, 10). The difference in 
distribution patterns between the DUF26 and these LRR sub- 
families could indicate that the LRR subfamilies originally 
expanded by mechanisms that did not include localized (e.g., 
tandem) duplications. Given sufficient evolutionary time, sev- 
eral rounds of polyploidization followed by chromosomal rear- 
rangements could produce a given subfamily of the observed size 
from a single prototypical gene. Alternatively, these LRR sub- 
families may have originally expanded via localized duplications 
that occurred early enough in evolutionary time that extensive 
chromosome reshuffling could have eliminated linkage between 
subfamily members. Both proposed mechanisms imply that the 
LRR X, XI, and XIII subfamilies may have expanded much 
earlier in time than the DUF26 subfamily. A comparative 
analysis of RLK subfamilies in other plant lineages should help 
to resolve this issue. 

Gene Coding for Protein Controlling Morphogenesis of Plants



0001 

Relevant area of industry 

This invention relates to genetic DNA that controls the 
morphogenesis of plants, and to DNA that codes for 
antisense RNA for such genes, and to plants whose 
morphology is transformed by such DNA, and provides a 
technique for adjusting the stem lengths and inflorescent 
forms of plants.

0002 

Prior art 

It is believed that plant morphology is affected by 
genetic factors and environmental factors. Hitherto, one 
method of creating short stemmed plants and plants of 
different inflorescent forms was to vary the morphology 
of useful varieties by genetic hybridization with plants 
of different morphology, but it was difficult to 
consistently obtain individuals with superior 
characteristics to those of the new variety. 



0003 



Moreover, variations occur in genes affecting morphology
through sudden spontaneous variation and through induced
sudden variation and individuals can be selected that
exhibit morphological change through reduced genetic
function, but it was very difficult to deliberately
create individuals in which variations occurred which
retained unaltered the useful acquired characteristics
simply through genes that controlled specific
morphogenesis.



0004 

Thus, it is believed that, were it possible to isolate 
the genes that control plant morphogenesis, it would be 
possible to create plants with reduced stem lengths by 
transforming the plant morphology by incorporating in the 
genes a vector that would express antisense RNA 
(antisense RNA expression vector).



0005 

In Shiroi Nunazuna (Arabidopsis thaliana), elongation of
the floral buds accompanies the change from the 
vegetative stage to the reproductive stage, and the 
growth of the buds is strongly associated with the 
elongation of the subsequent internodes, displaying a 
highly ordered branching pattern. The Landsberg erecta 
strain which is known as the standard ecotype of 
Arabidopsis thaliana retains the endogenous erecta 
variation and exhibits different floral buds. The 
flowers form dense, compact floral buds in the crown. 
The variation is pleiotropic and possesses round leaves 
and short flat siliques (the above is an abstract 
published at the UCLA Keystone Symposium, Huang, I. et 
al., (1991)). 



0006 

Strains possessing variations at the same gene locus as
the variation of the Landsberg erecta strain have yielded
dwarf variations known to genetics, which have been named
the er-101 strain, er-102 strain and er-103 strain.

0007 

Problem to be solved by the invention

It has been shown that a correlation between the dwarfing 
of plants and specific genes is to a certain extent 
known, but the gene itself has not yet been found. 

0008

The present invention was developed from this knowledge, 
and it is an objective of the invention to provide a gene 
that controls plant morphogenesis. Moreover it is an 
objective of the invention to provide plants that possess 
stems of greater or less length, or plants in which the 
inflorescence is altered. 

0009 

Means employed in order to solve the problem 

In order to achieve this objective, the inventors of the
present invention cloned the gene that controls the 
morphogenesis of the plant from the chromosomal DNA of 
Arabidopsis thaliana, and having obtained an antisense 
expression vector in which the gene was combined, 
discovered that the morphology of the individual plants 
whose morphology had been transformed by the vector DNA 
was altered and thus arrived at the present invention. 

0010

Thus the present invention is of DNA that codes for a 
protein that includes an amino acid sequence that 
possesses the amino acid sequence shown in Sequence 
Number 1 that possesses the action of controlling plant 
morphogenesis or alternatively the sequence in which one
or a plurality of amino acid residues that do not affect 
the action of controlling plant morphogenesis are 
substituted, excised or inserted. 



0011 

Moreover, the invention provides DNA that codes for 
antisense RNA that controls the expression of the 
aforementioned DNA. This DNA is further characterized by 
possessing a substantively complementary base sequence in 
at least one portion of the base sequence in Sequence 
Number 1. 



0012 

The inventors furthermore provide plants whose morphology 
has been transformed by the DNA that codes for the 
aforementioned protein, and also provide plants whose 
morphology has been transformed by DNA that codes for the 
aforementioned antisense RNA. 



0013 

The protein coded for by the DNA of the present invention 
is a protein related to the control of the morphogenesis 
of plants, having extensive expression on the stems and 
flowers thereof, which are greatly altered, and in 
particular to the elongation of the stems, the gene that 
codes for the protein transforming the morphology of 
plants, and being expected to promote the elongation of 
stems through the increased expression level of the gene. 



0014 

On the other hand, it may also be anticipated that the
elongation of the stem of the morphologically transformed
plant could be controlled through the morphological
transformation of the plant by the DNA that codes for the
aforementioned antisense RNA, or in other words, the DNA
that controls the expression of the gene related to the
control of the morphogenesis of the plant.

0015 

In these Specifications, 'chromosomal DNA' and
'chromosomal gene' refer to the DNA included in the
nuclear chromosomes of the plant cells and the gene
present on such DNA. Moreover, the gene that relates to
the control of the morphogenesis of plants provided by
the present invention may be referred to as the
'morphogene' or the 'gene envisaged by the present
invention'.

0016 

Mode of implementation of the invention 

The mode of implementation of the invention is described
below. The gene envisaged by the present invention may
be acquired from plants that possess the variation
relating to the variation of expressed morphology by the
isolation of the gene relating to such variation of
expressed morphology. Moreover, wild-type genes may be
acquired from the chromosomal DNA of wild-type plants
through hybridization with the oligonucleotides prepared
on the basis of the base sequences of the acquired
variant gene as a probe, or with pairs of
oligonucleotides prepared on the basis of the base
sequences of the variant gene as the primer by polymerase
chain reaction (PCR).


0017 

The preparation of plants that possess the variation
relating to the control of morphogenesis, the method of
isolating the gene of the present invention from the
variant, and the method of use of the gene are now
described in detail. The general methods required for
gene recombination, such as DNA cleavage, ligation and
transformation, the determination of gene base sequences,
hybridization and so forth, are described in the
documentation accompanying the commercial enzymes used in
these operations and in Molecular Cloning (Maniatis, T.
et al., Cold Spring Harbor Laboratory Press).

0018 

<1> Isolation and identification of gene controlling
plant morphogenesis

(1) Preparation of plants possessing variation relating
to the control of morphogenesis

An insertional mutagenesis (gene disruption) method, in
which foreign genes are introduced into the plant cell
and inserted into the chromosomal DNA so that the gene at
the insertion site is disrupted, is employed in order to
cause variation in the gene that controls morphogenesis
in plants such as Arabidopsis thaliana. Methods of gene
introduction that may be employed are the use of
Agrobacterium, electroporation of plant protoplasts, the
polyethylene glycol method, microinjection and so forth.
The use of Agrobacterium is the most effective of these
methods because the transformation efficiency in
Arabidopsis thaliana is high and because few variations
apart from those due to the gene introduction are caused.

0019

Here, foreign genes are introduced into the plant cells
by the Agrobacterium infection method employing a binary
vector system (a vector system that includes T-DNA
capable of introducing DNA into plant cells, a
replication origin functional in E. coli or other
microorganisms, and preferably a marker gene for
selecting the plant or microorganism cells), and plants
are prepared with variations in the gene that controls
morphogenesis.

0020 

Wild strains of Arabidopsis thaliana are infected with
Ti plasmid-derived binary vectors by the in planta
Agrobacterium infection method (Chang, S.S., Park, S.K.
et al.: Plant J., 5, No. 4 (1994)) and so forth, and the
seeds are harvested after growth for approximately 6
weeks. The resulting seeds are sown on agar medium
containing hygromycin and are cultivated. Those
transformed plants that exhibit hygromycin resistance are
transplanted into rock wool, and the plants are inspected
visually to select variants (erecta variants) whose stem
lengths are different from those of the wild plants.



0021

The variants acquired in this manner are highly likely
to have arisen through the insertion of the binary vector
into the gene that controls plant morphogenesis.

0022 

(2) Isolation of variant morphogenes

The variant genes relating to this variation are isolated
by what is known as the plasmid rescue method from the
variants exhibiting variant morphology acquired in this
manner. Chromosomal DNA is prepared from the variants by
the method described in Cell, 35 (1983), p. 35, is
digested with restriction enzymes, and the ends within
each molecule are joined by self-ligation. If the
resulting circular DNA includes the binary vector that
was inserted into the chromosome, the DNA molecules
function as plasmids capable of autonomous replication in
E. coli cells, and the transformants exhibit resistance
to the marker antibiotic (such as ampicillin).

0023 

E. coli is transformed with the circular DNA, the
recombinant plasmids are recovered from the transformants
that exhibit resistance to the marker, and chromosomal
DNA segments that include the binary vector and the
morphogene can thus be acquired.

0024 

Alternatively, chromosomal DNA is prepared from the
variant plants, is digested with a suitable restriction
enzyme and is then ligated to a plasmid or phage vector,
and a chromosomal library is prepared through
transformation of E. coli. The library is screened with
T-DNA and the like as probes, and clones possessing
variant gene fragments can be selected.

0025 

(3) Isolation of morphogenes from wild type genes

Chromosomal DNA libraries are prepared from wild type
Arabidopsis thaliana by employing P1 phage vectors and
the like, and the wild type morphogenes can be isolated
by hybridization with the variant gene fragments that
have been acquired in this manner as the probes.
Alternatively, primers are prepared on the basis of the
base sequences of the variant genes, and the gene
envisaged by the present invention can also be acquired
by amplification of the wild type genes from the
chromosomal DNA of the wild type plants by PCR.



0026 



The base sequence of the DNA fragments containing the
morphogene of Arabidopsis thaliana derived from the
practical embodiments described below is shown in
Sequence Table Sequence Number 2. This gene contains 27
exons and 26 introns.
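
The relationship between the exon and intron counts stated above (27 exons separated by 26 introns) can be illustrated with a minimal sketch. The toy sequence and coordinates below are hypothetical and are not taken from Sequence Number 2; the function names are likewise illustrative only. Coordinates are treated as 1-based and inclusive, as in the Sequence Tables.

```python
def introns_between(exons):
    """Return the 1-based inclusive intervals lying between consecutive exons."""
    ordered = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def splice(genomic, exons):
    """Join the exon segments (1-based inclusive coordinates) into one sequence."""
    return "".join(genomic[start - 1:end] for start, end in sorted(exons))


# Hypothetical toy gene: exons in upper case, introns in lower case.
parts = ["ATGGCC", "gtaagtttcag", "GATTGA", "gtatgcag", "CCCTAA"]
toy_genomic = "".join(parts)
toy_exons = [(1, 6), (18, 23), (32, 37)]

toy_introns = introns_between(toy_exons)   # n exons give n - 1 introns
toy_cdna = splice(toy_genomic, toy_exons)  # exon part only, as in a cDNA
print(toy_introns)
print(toy_cdna)
```

With n exons the sketch always yields n - 1 introns, which is consistent with the 27 and 26 given for the morphogene.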

0027 

The gene envisaged by the present invention, as explained
above, contains many introns. The cDNA of the morphogene
can be isolated in order to acquire the exon part, that
is, the DNA that codes for the protein that controls
morphogenesis. The cDNA library can be created by
extracting the mRNA from the aerial tissue of
Arabidopsis thaliana, synthesizing DNA by means of
reverse transcriptase, converting it into double-stranded
DNA by means of a polymerase reaction, inserting it into
a vector, and transforming E. coli and the like. cDNA
cloning kits are commercially available and may be
employed. The morphogene cDNA clones are obtained from
the resulting cDNA libraries with the chromosomal genes
as probes.



0028 

The base sequence of the cDNA from the practical
embodiment described below, and the amino acid sequence
inferred from the base sequence, are shown as Sequence
Table Sequence Number 1. The translated product of this
gene exhibits similarity to RLK5 shown in Nature, 345
(1990), p. 743. The RLK5 gene was isolated as a gene for
a receptor-like protein kinase present in cell membranes,
but because it was isolated through its similarity to the
base sequences of known protein kinase genes, the
functions of the gene and its translated product are
completely unknown. The pattern of expression of the RLK5
gene is different from that of the gene controlling
morphogenesis: the RLK5 gene is expressed in the roots as
well as in the aerial part, and it is believed that the
two genes are functionally different.

0029 

<2> Use of morphogene 

The gene envisaged by the present invention is a gene
associated with the control of plant morphogenesis, and
particularly with the elongation of the stem, and it is
anticipated that the elongation of plant stems can be
promoted through the transformation of plants with the
gene and an increase in the level of expression of the
gene.



0030 

In order to use the gene envisaged by the present
invention to transform plants, DNA may be introduced into
the protoplast by electroporation or alternatively by
using the Ti plasmid of Agrobacterium. In this case,
chromosomal genes or cDNA prepared from mRNA may be
employed as the genes envisaged by the present invention.

0031

On the other hand, it is anticipated that it will be
possible to control the elongation of the stems of
transformed plants by transforming plants with DNA that
expresses antisense RNA that controls the expression of
the gene envisaged by the present invention, that is, RNA
that possesses a sequence complementary to the entire
length, or to at least a part, of the mRNA transcribed
from the morphogene.



0032

The DNA that expresses the antisense RNA is acquired by
linking the antisense chain (the chain possessing a base
sequence complementary to the sense chain (code chain)),
or at least a portion thereof, downstream of the
promoter. In other words, double-stranded DNA that
includes the sequence of the code chain, or at least a
part thereof, is linked downstream of the promoter in the
reverse direction to the direction of the original
transcription. The antisense chain is acquired from the
chromosomal DNA or cDNA, but when chromosomal DNA is
used, the use of the exon portion is preferred because
the introns cannot be anticipated to exercise the
function of controlling the expression of the gene
envisaged by the present invention. Moreover, any of the
code region, the 5' untranslated region or the 3'
untranslated region may be used as at least a portion of
the antisense chain. Moreover, the DNA that codes for the
antisense RNA may include, in addition to the 3'
untranslated region, a sequence that codes for poly-U
complementary to the poly(A) chain added to the 3' end of
the mRNA.
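
The effect of linking a fragment of the code chain to the promoter in the reverse direction can be sketched as follows. This is only an illustration under simple assumptions: the 15-base fragment is a hypothetical example, the function names are not taken from this specification, and transcription is modelled as nothing more than reading off the reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def to_rna(dna):
    """Rewrite a DNA string as RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def antisense_rna(code_chain):
    """RNA transcribed when the fragment is inserted in the reverse direction."""
    return to_rna(reverse_complement(code_chain))


def is_fully_complementary(rna_a, rna_b):
    """True if rna_a (5'->3') can pair base by base with rna_b read 3'->5'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs[x] == y for x, y in zip(rna_a, reversed(rna_b)))


fragment = "ATGGCTCTGTTTAGA"      # hypothetical piece of a code chain
sense = to_rna(fragment)          # corresponding stretch of the mRNA
anti = antisense_rna(fragment)    # antisense RNA for the same stretch
print(sense, anti, is_fully_complementary(sense, anti))
```

The same function applies whether the fragment is taken from the code region or from an untranslated region, since only the orientation relative to the promoter matters in this model.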

0033

The CaMV 35S promoter may be used as the promoter in the
present invention. Transformation of plants with the DNA
that codes for the antisense RNA may be performed in the
same manner as transformation of plants with the gene
envisaged by the present invention.



0034 

Furthermore, the region upstream of the coding region of
the gene envisaged by the present invention includes a
region that controls the expression of the gene. This
region includes at least the sequence represented by
Base Numbers 1 to 1752 in Sequence Number 2, and more
specifically, the region of Base Numbers 396 to 1752.
This region is anticipated to be capable of use in the
control of the expression of genes in plant cells.
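
Because the Base Numbers are 1-based and inclusive, extracting these regions from a sequence held as a string needs an offset of one at the start. The following minimal sketch uses a placeholder string of the stated length of Sequence Number 2 (9295 bases) rather than the real sequence; the function name is illustrative.

```python
def region(seq, first, last):
    """Return bases first..last of seq, using 1-based inclusive numbering."""
    if not (1 <= first <= last <= len(seq)):
        raise ValueError("coordinates outside the sequence")
    return seq[first - 1:last]


# Placeholder only: stands in for the 9295-base genomic sequence.
placeholder = "N" * 9295

upstream = region(placeholder, 1, 1752)      # Base Numbers 1 to 1752
narrower = region(placeholder, 396, 1752)    # Base Numbers 396 to 1752
print(len(upstream), len(narrower))
```

Under this numbering the broader region is 1752 bases long and the narrower one 1357 bases long.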

0035 

Practical embodiment 

The present invention is described in greater detail 
below by means of a practical embodiment of the 
invention. 

(1) Preparation of variant plants in relation to the 
control of morphogenesis 

Shiroi Nunazuna (Arabidopsis thaliana) ecotype WS
(Wassilewskija, purchased from Lehle Seeds) was infected
in planta with Agrobacterium tumefaciens strain EHA101
that possessed the pGDW32 binary vector (derived from Ti
plasmids, possessing a hygromycin resistance gene and an
ampicillin resistance gene, and capable of autonomous
replication in E. coli cells) by the Agrobacterium
infection method (Chang, S.S., Park, S.K. et al.: Plant
J., 5, No. 4 (1994)).

0036 

This is described more specifically below. Agrobacterium
into which pGDW32 had been introduced was proliferated
overnight in LB medium containing the antibiotic
(hygromycin). 10 µl of this culture was diluted (1/20)
with 190 µl of Gamborg's B5 culture medium (containing 2%
sucrose, pH 5.5). Agrobacterium treatment was performed
from 19 to 23 days after inoculation, at the growth stage
at which bolting had only just commenced.
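
The 1/20 dilution quoted above follows directly from the two volumes. A minimal arithmetic sketch (the function name is illustrative, and both volumes are in microlitres):

```python
from fractions import Fraction


def dilution_factor(sample_volume, diluent_volume):
    """Fraction of the final volume that is made up of the original culture."""
    return Fraction(sample_volume, sample_volume + diluent_volume)


factor = dilution_factor(10, 190)   # 10 of culture in 190 of B5 medium
print(factor)
```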

0037 

Only those flower stems that had begun to lengthen were
cut at their bases with a #11 scalpel, and a 26G x 1/2 size
hypodermic needle was employed to perforate the rosette
stem from the incision. 1 µl of dilute Agrobacterium
solution was injected into the wound. During this time,
the plants were illuminated at 3000 to 4000 lux. Three
days after inoculation, the plants were transplanted into
rock wool fibre mini-pots and were then grown in the
normal manner. Approximately 6 weeks after inoculation,
the seeds were harvested from the early-formed siliques
(approximately half of the total).

0038 

The seeds obtained were sown on B5 agar medium
containing 10 µg/ml of hygromycin and were grown under a
12 hours daylight/12 hours night lighting cycle under
6,000 to 12,000 lux of illumination at a temperature of
22°C and at a humidity of 30% to 40%. HYPONeX
(manufactured by Murakami Bussan KK) diluted to 1/1000
was used as the nutrient. The transformed plants that
exhibited hygromycin resistance, and thus possessed the
inserted pGDW32 T-DNA sequence, were transplanted to rock
wool and grown, and the T4 seeds (fourth generation
seeds) were collected. The plants were inspected visually
at this time, and the variants whose stem elongation
differed from that of the wild type plants were acquired.

0039 

(2) Isolation of variant genes 

Arabidopsis genomic DNA was prepared by the method
described in Cell, 35 (1983), p. 35.

0040 

5 g of Arabidopsis tissue (root) was ground to a fine
powder in a mortar in liquid nitrogen at -80°C; this was
added to 25 ml of DNA isolation buffer (50 mM Tris-HCl,
pH 7.5, 0.2 M NaCl, 20 mM EDTA-Na2, 2% N-lauroyl
sarcosine sodium salt, 3 g/ml urea and 5% TE-saturated
phenol) and stirred, 25 ml of phenol/chloroform was
added, whereupon 1.5 ml of 10% SDS (sodium dodecyl
sulphate) was added and the mixture was stirred gently at
room temperature for 10 minutes. This was centrifuged
for 10 minutes at 6000 rpm, whereupon 25 ml of
phenol/chloroform was added to the aqueous layer and the
solution was centrifuged again for 10 minutes at 6000
rpm. 15 ml of phenol was added to the aqueous layer and
the solution was stirred and was then immediately
centrifuged for 10 minutes at 6000 rpm. The supernatant
was discarded, 25 ml of 70% ethanol was added to the
sediment, the mixture was stirred in a vortex mixer and
was then centrifuged for 10 minutes at 6000 rpm,
whereupon the supernatant was discarded and the residue
was dried under reduced pressure. The residue was
dissolved in 400 µl of TE at a pH of 8.0 (10 µg/µl
RNase).

0041 

200 µg of chromosomal DNA prepared in this manner from
the variants relating to the control of morphogenesis was
then purified by the CsCl ultracentrifugation method.
1 µg of this chromosomal DNA was digested with EcoRI or
XbaI, the restriction enzyme fragments were purified by
phenol extraction and ethanol precipitation, and the ends
within the molecules were joined by self-ligation; E.
coli (XL1-Blue MRF' (purchased from Stratagene)) was then
transformed with the resulting circular DNA. This yielded
approximately one thousand colonies of a transformed
strain that exhibited resistance to ampicillin. Plasmid
DNA was extracted from the resistant colonies acquired
and analysis was performed.
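
The digestion step that precedes self-ligation can be pictured with a short sketch that locates recognition sites and splits a linear sequence at the cut points. The recognition sequences and cut positions used (EcoRI, G^AATTC; XbaI, T^CTAGA) are the standard ones for these enzymes; the toy sequence and the function names are hypothetical, and only the top strand is modelled.

```python
# Recognition site and offset of the cut within the site (top strand).
SITES = {"EcoRI": ("GAATTC", 1), "XbaI": ("TCTAGA", 1)}


def cut_positions(seq, enzyme):
    """Return 0-based indices at which the top strand is cut."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzyme):
    """Split a linear sequence into the fragments left by complete digestion."""
    cuts = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical toy sequence with two EcoRI sites flanking one XbaI site.
toy = "AAA" + "GAATTC" + "CCC" + "TCTAGA" + "GGG" + "GAATTC" + "TTT"
print(digest(toy, "EcoRI"))
print(digest(toy, "XbaI"))
```

In plasmid rescue, the fragment of interest is the one that carries both the vector's replication origin and marker and the flanking chromosomal DNA; circularizing that fragment yields a plasmid that can be recovered from ampicillin-resistant colonies.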

0042 

Those of the rescued plasmids that had been derived from
DNA digested with EcoRI were labelled pREa and pREb, and
those that had been derived from DNA digested with XbaI
were labelled pRXa and pRXb.

0043 

(3) Isolation of variant genes

The variant genes were isolated from the chromosomal DNA
library with the chromosomal DNA fragments contained in
the plasmids rescued in the manner described above as the
probes. P1 phage vector (purchased from Du Pont) was
employed in order to prepare a nuclear DNA library of the
Arabidopsis ecotype Columbia strain by the method
described in The Plant Journal, 7 (1995), p. 351. The
EcoRI and XbaI fragments prepared respectively from pRXb
and pREa were 32P-labeled and were used as probes in
order to perform plaque hybridization. Two positive
clones were obtained as a result; these were labeled 28D7
and 61H10 respectively. Restriction enzyme maps were
prepared for these two clones, and it was found that 28D7
held approximately 25 kb and 61H10 held approximately 75
kb of inserted fragments derived from Arabidopsis
chromosomal DNA (Figure 1).

0044 

Subcloning of the variant genes contained in the inserted
DNA was performed in order to determine the base
sequences. Restriction enzyme maps were prepared of the
sequences adjacent to the T-DNA, and Southern analyses
of the inserted fragments were performed in order to
estimate the T-DNA insertion sites on the inserted
fragments (Figure 2). The chromosomal DNA sequences in
the vicinity of the insertion sites were subcloned in
Bluescript II SK+. The sites on the inserted fragments
of the resulting subclones are shown in Figure 3.



0045 

(4) Isolation of cDNA clones of genes related to the
control of morphogenesis

mRNA was prepared from the aerial tissue of the
Arabidopsis ecotype Columbia strain, cDNA was prepared by
the method described in Molecular Cloning (Sambrook, J.,
Fritsch, E.F. and Maniatis, T. (1989): A Laboratory
Manual, Second Edition, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY), and a cDNA library was
prepared by using λYES (purchased from Clontech) as the
vector. The sequences adjacent to the T-DNA were excised
from the aforementioned pRXb and pREa and were
32P-labeled. These were used as the probes when plaque
hybridization was performed on 300,000 plaques of the
phage library. The selected plasmid was pKUT161.

0046 

(5) Analysis of gene related to the control of
morphogenesis

The base sequences of the cDNA clone contained in pKUT161
and of the chromosomal gene were analyzed. The amino acid
sequence was inferred from the cDNA base sequence and
this sequence is shown in Sequence Table Sequence Number
1. The base sequence of the DNA fragment including the
chromosomal gene is shown in Sequence Number 2.

0047 

When the amino acid sequence inferred from the cDNA base
sequence was analyzed, the translated product of the gene
was found to exhibit similarities to the RLK5 described
in Nature, 345 (1990), p. 743. The RLK5 gene was isolated
as a gene for a receptor-like protein kinase present on
the cell membrane, but because it was isolated through
its similarity to the base sequences of known protein
kinase genes, the functions of the gene and the
translated products thereof are entirely unknown. The
pattern of expression of the RLK5 gene is different from
that of the gene related to the control of morphogenesis:
the RLK5 gene is expressed not only in the aerial part
but also in the roots, and the two are considered to be
functionally different genes.

0048 

(6) Analysis of variant related to control of
morphogenesis

The variant related to the control of morphogenesis
acquired at (1) was considered, from the elongation of
the stem, to be a variant at the same gene locus as the
erecta variant of the Landsberg erecta strain. Thus,
first, genetic complementation among the Landsberg erecta
strain, the er-103 strain and the variant isolated by the
present invention was tested. As a result, it was found
that all these variants arose from a variation at the
same gene locus, and the variant isolated was designated
the er-104 strain.

0049 

mRNA was isolated from the Landsberg erecta strain and
the er-103 strain and was amplified by the RT-PCR
(reverse transcription - PCR) method. The mRNA was used
as the template, and the reverse transcription reaction
was performed with the oligonucleotide possessing the
base sequence shown in Sequence Number 4 as the primer;
the cDNA acquired was then used as the template, and the
PCR reaction was performed with the aforementioned
oligonucleotide as the 3' end primer and an
oligonucleotide possessing the base sequence shown in
Sequence Number 3 as the 5' end primer.
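
How a 5' end primer and a 3' end primer delimit the amplified product can be sketched in a few lines. The template and both primers below are hypothetical toy sequences, not Sequence Numbers 3 and 4, and the model assumes exact matches only. The 5' end primer is written as it appears in the code chain, and the 3' end primer as the reverse complement of the code chain, which is also why it can prime reverse transcription from the mRNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def amplicon(template, five_prime_primer, three_prime_primer):
    """Return the stretch of the code chain bounded by the two primers, or None."""
    start = template.find(five_prime_primer)
    if start == -1:
        return None
    site = reverse_complement(three_prime_primer)
    end = template.find(site, start + len(five_prime_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical toy cDNA (code chain) and primers.
toy_cdna = "GGG" + "ATGGCTCTG" + "TTTAGAGATATT" + "GTTCTTCTT" + "CCC"
five_prime_primer = "ATGGCTCTG"
three_prime_primer = "AAGAAGAAC"

product = amplicon(toy_cdna, five_prime_primer, three_prime_primer)
print(product)
```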

0050 

The resulting amplified product was used as a template
and the base sequence was determined by the direct
sequencing method using the aforementioned
oligonucleotide primer. As a result it was found that in
the Landsberg erecta strain, the isoleucine residue at
Amino Acid Number 750 had been substituted by a lysine
residue in association with the substitution of A for T
at Base Number 2299 in Sequence Table Sequence Number 1.
In the er-103 strain, the methionine residue at Amino
Acid Number 282 had been substituted by an isoleucine
residue due to the substitution of A for G at Base Number
896 in Sequence Table Sequence Number 1.
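
The correspondence between the Base Numbers and the Amino Acid Numbers given above can be checked with a short calculation, assuming that the coding sequence occupies bases 51 to 2978 of Sequence Number 1 as stated in the Sequence Table. The wild type codons used below are assumptions chosen for illustration: ATA is the only isoleucine codon in the standard genetic code that a T to A change at its second base turns into a lysine codon, and ATG is the only methionine codon. The function names are illustrative.

```python
# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

CDS_START, CDS_END = 51, 2978   # "Sites: 51..2978" in the Sequence Table


def locate(base_number, cds_start=CDS_START):
    """Map a Base Number to (Amino Acid Number, position within the codon)."""
    offset = base_number - cds_start
    return offset // 3 + 1, offset % 3 + 1


def mutate(codon, position, new_base):
    """Replace the base at a 1-based position within a codon."""
    return codon[:position - 1] + new_base + codon[position:]


print(locate(2299), CODON_TABLE["ATA"], CODON_TABLE[mutate("ATA", 2, "A")])
print(locate(896), CODON_TABLE["ATG"], CODON_TABLE[mutate("ATG", 3, "A")])
print((CDS_END - CDS_START + 1) // 3)   # number of codons in the CDS
```

Base Number 2299 falls on the second base of codon 750 and Base Number 896 on the third base of codon 282, in agreement with the substitutions described.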

0051 

It was found that in all these variants, portions of the 
base sequence for the gene related to the control of 
morphogenesis possessed variations that gave rise to 
changes in the amino acid sequences, and this gene could 
be identified as the gene that controlled morphogenesis 
(morphogene).

0052 

(7) Analysis of expression of morphogene 

As a further method of confirming that the cDNA isolated
in the manner described above and the chromosomal gene
corresponded to the gene that controlled morphogenesis,
analyses were made of the extent of expression in each of
the plant tissues, and Northern analysis was employed in
order to make a comparative analysis of the extent of
gene expression between the wild type plants and the
variant type plants. cDNA clones were used as the
probes. As a result of the Northern analysis using whole
individual wild type plants and Landsberg erecta
variants, the expression of the morphogene (formation of
mRNA) was observed to be strong in the wild type plants,
but was barely detected in the variants.

0053

Moreover, as a result of analyses of the extent of
expression of the morphogene in each of the tissues of
the wild type plant, the extent of expression was found
to be high in the tissues, such as the stems and flowers,
in which the variation seen in the variants was great,
and this also confirmed that the gene was the gene
related to the elongation of the stem.

0054 

Moreover, as the expression of the morphogene is site and
time specific as shown above, the expression control
region for the gene envisaged by the present invention
can be used to control the expression of exogenous genes
in plants. The expression control region is a region
that includes at least a portion of the sequence
represented by Base Numbers 1 to 1752 in Sequence Number
2, and more specifically, the region of Base Numbers 396
to 1752.

0055 

Effects of the invention 

The invention provides a gene that controls the 
morphogenesis of plants. It can be anticipated that the 
elongation of the stems of plants could be promoted by 
increasing the extent of expression of the gene. 
Moreover, it could be anticipated that the elongation of 
the stems of plants could be controlled by transforming 
plants by means of the DNA sequence that expresses the 
antisense RNA for the gene. 



0056 

Sequence Tables 

Sequence Number: 1
Sequence length: 3176
Sequence type: Nucleic acid
Number of chains: Two
Topology: Straight chain
Class of sequence: cDNA to mRNA
Source:
Plant name: Shiroi Nunazuna (Arabidopsis thaliana)
Strain: Columbia
Characteristics of sequence:
Symbol indicating characteristics: CDS
Sites: 51..2978

CTTTTAAAGT ATATCTAAAA ACGCAGTCGT TTTAAGACTG TGTGTGAGAA ATC GCT 

1 

CTG TTT AGA GAT ATT GTT CTT CTT GGG TTT CTC TTC TGC TTG AGC TTA 
Leu Phe Arg As P lie Val Leu Leu Gly Phe Leu Phe Cys Leu Ser Leu 
5 10 15 



56 



104 
GTA GCT
Val Ala 
20 

AAG TCA 
Lys Ser 

35 

CCT tct 
Pro Ser 

ACC TTC 
Thr Phe 

GAA ATC 
Glu lie 



ACT GTG ACT TCA 
Thr Val Thr Ser 



TTC AAA 
Phe Lys 

TCG GAT 
Ser Asp 



CTG CGA 
Leu Arg 
100 
TGT TCT 
Cys Ser 
115 

GAC ATA 
Asp lie 

CTG AAG 
Leu Lys 

ATT CCA 
lie Pro 



AAT GTT 
Asn Val 
70 

TCA CCT 
Ser Pro 
85 
GGT AAT 
Gly Asn 



GAT GTG 
Asp Val 
40 

TAT TGT 
Tyr Cys 
55 

GTT GCT 
Val Ala 



GAG GAG 
Glu Glu 
25 

AAC AAT 
Asn Asn 



GGA GCA ACG 
Gly Ala Thr 



GTC TGG 
Val Tn> 

CTT AAT 
Leu Asn 



GCT ATT 
Ala lie 

CGC TTG 
Arg Leu 



TCT TTG 
Ser Leu 

CCG TTT 
Pro Phe 



GAG ATA 
Glu lie 
180 
TTG CGA 
Leu Arg 
195 

CTG ACT 
Leu Thr 

AGT ATA 
Ser lie 

TTG TCC 
Leu Ser 



AAT AAC 
Asn Asn 
150 
AAC CTG 
Asn Leu 
165 

CCA AGA 
Pro Arg 



CAA AAC 
Gin Asn 
120 
TCG ATT 
Ser He 
135 

CAA TTG 
Gin Leu 



GGA GAT 
Gly Asp 
90 

TCT GGA 
Ser Gly 
105 

TTA GAC 
Leu Asp 



GTT CTT TAT 
Val Leu Tyr 
45 

AGA GGT GTG 
Arg Gly Val 
60 

TTG TCA GAT 
Leu Ser Asp 
75 

CTC AAG AGT 
Leu Lys Ser 



TTG CTG GAG 
Leu Leu Glu 
30 

GAC TGG ACA 
Asp Tn> Thr 



TCT TGT GAA 
Ser Cys Glu 



CAA ATC CCT 
Gin He Pro 



TCG AAG 
Ser Lys 

ATA GGA 
He Gly 



AAA ATT 
Lys He 

CTT ATT 
Leu He 



GGA AAC 
Gly Asn 

GGT CTT 
Gly Leu 



CTG CAA 
Leu Gin 
260 
ATT CCA 
He Pro 



CCT GAG 
Pro Glu 
230 
TAC AAT 
Tyr Asn 
245 

GTT GCA 
Val Ala 



AAC TTA 
Asn Leu 
200 
TGG TAT 
Trp Tyr 
215 

ACG ATA 
Thr He 



CTG GAC 
Leu Asp 
170 
TAC TGG 
Tyr Trp 
185 

GTC GGT 
Val Gly 



TTA TCC TTC 
Leu Ser Phe 
125 

TTG AAG CAA 
Leu Lys Gin 

140 
CCG ATC CCT 
Pro lie Pro 
155 

TTG GCA CAG 
Leu Ala Gin 



TTG AAT CTT 
Leu Asn Leu 
80 

CTC TTG TCA 
Leu Leu Ser 

95 

GAT GAG ATT 
Asp Glu He 
110 

AAT GAA TTA 
Asn Glu Leu 



ATT AAG 
lie Lys 

ACT TCA 
Thr Ser 
50 

AAT GTC 
Asn Val 
65 

GAT GGA 
Asp Gly 



CTT GAG CAG 
Leu Glu Gin 



AAT GAA GTT 
Asn Glu Val 



TTT GAC 
Phe Asp 

GGA AAT 
Gly Asn 



CAG CTA 
Gin Leu 

ACA TTA 
Thr Leu 



TCA GTG ATT GGT 
Ser Val lie Gly 



ACT GGT 
Thr Gly 
250 
TCA TTG 
Scr Leu 
265 

CTC ATG 
Leu Met 



AAC ATT TCT 
Asn lie Ser 
205 

GTA AGA AAC 
Val Arg Asn 

220 
TGC ACT GCC 
Cys Thr Ala 
235 

GAG ATC CCT 
Glu He Pro 



TCA ACA CTT 
Ser Thr Leu 
160 

AAT AAA CTC 
Asn Lys Leu 

175 
CTT CAG TAT 
Leu Gin Tyr 
190 

CCA GAT TTG 
Pro Asp Leu 



ATT GAT 
lie Asp 

GGT GAC 
Gly Asp 

AGT GGT 
Ser Gly 
130 
CTG ATT 
Leu He 
145 

TCA CAG 
Ser Gin 



AAC AGT TTG 
Asn Ser Leu 



152 



200 



248 



296 



344 



392 



440 



488 



536 



CAA GGC AAT 
Gin Gly Asn 



CAA GCC CTT 
Gin Ala Leu 



TTC CAG GTT 
Phe Gin Val 
240 

TTT GAC ATC 
Phe Asp He 

255 
CAA CTC TCT 
Gin Leu Scr 
270 

GCA GTC TTA 
Ala Val Leu 



AGT GGT 
Ser Gly 

CTT GGG 
Leu Gly 

TGT CAA 
Cys Gin 
210 
ACT GCT 
Thr Gly 
225 

TTG GAC 
Leu Asp 



GGC TTC 
Gly Phe 

GGG AAG 
Gly Lys 

GAT CTA 
Asp Leu 



584 



632 



680 



728 



776 



824 



872 



920 
275 280 285 290

ACT GGC AAC TTG TTG ACT GGA TCT ATT CCT CCG ATT CTC GGA AAT CTT
Ser Gly Asn Leu Leu Ser Gly Ser Ile Pro Pro Ile Leu Gly Asn Leu

295 300 305 

ACT TTC ACC GAG AAA TTG TAT TTG CAC ACT AAC AAG CTG ACT GGT TCA 
Thr Phe Thr Glu Lys Leu Tyr Leu His Ser Asn Lys Leu Thr Gly Ser 

310 315 320 

ATT CCA CCT GAG CTT GGA AAC ATG TCA AAA CTC CAT TAC CTG GAA CTC 
Ile Pro Pro Glu Leu Gly Asn Met Ser Lys Leu His Tyr Leu Glu Leu

325 330 335 

AAT GAT AAT CAT CTC ACG GGT CAT ATA CCA CCA GAG CTT GGG AAG CTT 
Asn Asp Asn His Leu Thr Gly His Ile Pro Pro Glu Leu Gly Lys Leu

340 345 350 

ACT GAC TTG TTT GAT CTG AAT GTG GCC AAC AAT GAT CTG GAA GGA CCT 
Thr Asp Leu Phe Asp Leu Asn Val Ala Asn Asn Asp Leu Glu Gly Pro 
355 360 365 370 

ATA CCT GAT CAT CTG AGC TCT TGC ACA AAT CTA AAC AGC TTA AAT GTT 
Ile Pro Asp His Leu Ser Ser Cys Thr Asn Leu Asn Ser Leu Asn Val

375 380 385 

CAT GGG AAC AAG TTT ACT GGC ACT ATA CCC CGA GCA TTT CAA AAG CTA 
His Gly Asn Lys Phe Ser Gly Thr Ile Pro Arg Ala Phe Gln Lys Leu

390 395 400 

GAA ACT ATG ACT TAC CTT AAT CTG TCC AGC AAC AAT ATC AAA GGT CCA 
Glu Ser Met Thr Tyr Leu Asn Leu Ser Ser Asn Asn Ile Lys Gly Pro

405 410 415 

ATC CCG GTT GAG CTA TCT CGT ATC GGT AAC TTA GAT ACA TTG GAT CTT 
Ile Pro Val Glu Leu Ser Arg Ile Gly Asn Leu Asp Thr Leu Asp Leu

420 425 430 

TCC AAC AAC AAG ATA AAT GGA ATC ATT CCT TCT TCC CTT GGT GAT TTG 
Ser Asn Asn Lys Ile Asn Gly Ile Ile Pro Ser Ser Leu Gly Asp Leu
435 440 445 450 

GAG CAT CTT CTC AAG ATG AAC TTG ACT AGA AAT CAT ATA ACT GGT GTA 
Glu His Leu Leu Lys Met Asn Leu Ser Arg Asn His Ile Thr Gly Val

455 460 465 

CTT CCA GGC GAC TTT GGA AAT CTA AGA AGC ATC ATG GAA ATA GAT CTT 
Val Pro Gly Asp Phe Gly Asn Leu Arg Ser Ile Met Glu Ile Asp Leu

470 475 480 

TCA AAT AAT GAT ATC TCT GGC CCA ATT CCA GAA GAG CTT AAC CAA TTA 
Ser Asn Asn Asp Ile Ser Gly Pro Ile Pro Glu Glu Leu Asn Gln Leu

485 490 495 

CAG AAC ATA ATT TTG CTG AGA CTG GAA AAT AAT AAC CTG ACT GGT AAT 
Gln Asn Ile Ile Leu Leu Arg Leu Glu Asn Asn Asn Leu Thr Gly Asn

500 505 510 

GTT GGT TCA TTA GCC AAC TGT CTC AGT CTC ACT CTA TTG AAT GTA TCT 
Val Gly Ser Leu Ala Asn Cys Leu Ser Leu Thr Val Leu Asn Val Ser 
515 520 525 530 

CAT AAC AAC CTC GTA GGT GAT ATC CCT AAG AAC AAT AAC TTC TCA AGA 
His Asn Asn Leu Val Gly Asp Ile Pro Lys Asn Asn Asn Phe Ser Arg

535 540 545 

TTT TCA CCA GAC AGC TTC ATT GGC AAT CCT GGT CTT TGC GGT AGT TGG 
Phe Ser Pro Asp Ser Phe Ile Gly Asn Pro Gly Leu Cys Gly Ser Trp

550 555 560 

CTA AAC TCA CCG TGT CAT GAT TCT CGT CGA ACT GTA CGA GTG TCA ATC 
Leu Asn Ser Pro Cys His Asp Ser Arg Arg Thr Val Arg Val Ser Ile

565 570 575 

TCT AGA GCA GCT ATT CTT GGA ATA GCT ATT GGG GGA CTT GTG ATC CTT 
Ser Arg Ala Ala Ile Leu Gly Ile Ala Ile Gly Gly Leu Val Ile Leu

580 585 590 

CTC ATG GTC TTA ATA GCA GCT TGC CGA CCG CAT AAT CCT CCT CCT TTT 
Leu Met Val Leu Ile Ala Ala Cys Arg Pro His Asn Pro Pro Pro Phe
595 600 605 610 

CTT GAT GGA TCA CTT GAC AAA CCA GTA ACT TAT TCG ACA CCG AAG CTC 
Leu Asp Gly Ser Leu Asp Lys Pro Val Thr Tyr Ser Thr Pro Lys Leu 

615 620 625 

GTC ATC CTT CAT ATG AAC ATG GCA CTC CAC GTT TAC GAG GAT ATC ATG 
Val Ile Leu His Met Asn Met Ala Leu His Val Tyr Glu Asp Ile Met

630 635 640 

AGA ATG ACA GAG AAT CTA ACT GAG AAG TAT ATC ATT GGG CAC GGA GCA 
Arg Met Thr Glu Asn Leu Ser Glu Lys Tyr Ile Ile Gly His Gly Ala

645 650 655 

TCA AGC ACT GTA TAC AAA TGT GTT TTG AAG AAT TGT AAA CCG GTT GCG 
Ser Ser Thr Val Tyr Lys Cys Val Leu Lys Asn Cys Lys Pro Val Ala 

660 665 670 

ATT AAG CGG CTT TAC TCT CAC AAC CCA CAG TCA ATG AAA CAG TTT GAA 
Ile Lys Arg Leu Tyr Ser His Asn Pro Gln Ser Met Lys Gln Phe Glu
675 680 685 690 

ACA GAA CTC GAG ATG CTA ACT AGC ATC AAG CAC AGA AAT CTT GTG AGC 
Thr Glu Leu Glu Met Leu Ser Ser Ile Lys His Arg Asn Leu Val Ser

695 700 705 

CTA CAA GCT TAT TCC CTC TCT CAC TTG GGG ACT CTT CTG TTC TAT GAC 
Leu Gln Ala Tyr Ser Leu Ser His Leu Gly Ser Leu Leu Phe Tyr Asp

710 715 720 

TAT TTG GAA AAT GGT AGC CTC TGG GAT CTT CTT CAT GGC CCT ACG AAG 
Tyr Leu Glu Asn Gly Ser Leu Trp Asp Leu Leu His Gly Pro Thr Lys 

725 730 735 

AAA AAG ACT CTT GAT TGG GAC ACA CGG CTT AAG ATA GCA TAT GGT GCA 
Lys Lys Thr Leu Asp Trp Asp Thr Arg Leu Lys Ile Ala Tyr Gly Ala

740 745 750 

GCA CAA GGT TTA GCT TAT CTA CAC CAT GAC TGT ACT CCA AGG ATC ATT 
Ala Gln Gly Leu Ala Tyr Leu His His Asp Cys Ser Pro Arg Ile Ile
755 760 765 770 

CAC AGA GAC GTG AAG TCG TCC AAC ATT CTC TTG GAC AAA GAC TTA GAG 
His Arg Asp Val Lys Ser Ser Asn Ile Leu Leu Asp Lys Asp Leu Glu

775 780 785 

GCT CGT TTG ACA GAT TTT GGA ATA GCG AAA AGC TTG TGT GTG TCA AAG 
Ala Arg Leu Thr Asp Phe Gly Ile Ala Lys Ser Leu Cys Val Ser Lys

790 795 800 

TCA CAT ACT TCA ACT TAC GTG ATG GGC ACG ATA GGT TAC ATA GAC CCC 
Ser His Thr Ser Thr Tyr Val Met Gly Thr Ile Gly Tyr Ile Asp Pro
805 810 815 



1784 



1832 



1880 



1928 



1976 



2024 



2072 



2120 



2168 



2216 



2264 



2312 



2360 



2408 



2456 



2504 






CAC TAT GCT CGC ACT TCA CGC CTC ACT GAC AAA TCC CAT GTC TAC ACT 2552 
Glu Tyr Ala Arg Thr Ser Arg Leu Thr Glu Lys Ser Asp Val Tyr Ser 

820 825 830 

TAT GGA ATA GTC CTT CTT GAG CTG TTA ACC CGA AGG AAA GCC GTT GAT 2600 
Tyr Gly Ile Val Leu Leu Glu Leu Leu Thr Arg Arg Lys Ala Val Asp
835 840 845 850 

GAC GAA TCC AAT CTC CAC CAT CTG ATA ATG TCA AAG ACG GGG AAC AAT 2648 
Asp Glu Ser Asn Leu His His Leu Ile Met Ser Lys Thr Gly Asn Asn

855 860 865 

GAA GTG ATG GAA ATG GCA GAT CCA GAC ATC ACA TCG ACG TGT AAA GAT 2696
Glu Val Met Glu Met Ala Asp Pro Asp Ile Thr Ser Thr Cys Lys Asp

870 875 880 

CTC GGT GTG GTG AAG AAA GTT TTC CAA CTG GCA CTC CTA TGC ACC AAA 2744 
Leu Gly Val Val Lys Lys Val Phe Gin Leu Ala Leu Leu Cys Thr Lys 

885 890 895 

AGA CAG CCG AAT GAT CGA CCC ACA ATG CAC CAG GTG ACT CGT GTT CTC 2792 
Arg Gin Pro Asn Asp Arg Pro Thr Met His Gin Val Thr Arg Val Leu 

900 905 910 

GGC ACT TTT ATG CTA TCG GAA CAA CCA CCT GCT GCG ACT GAC ACG TCA 2840 
Gly Ser Phe Met Leu Ser Glu Gin Pro Pro Ala Ala Thr Asp Thr Ser 
915 920 925 930 

GCG ACG CTG GCT GGT TOG TGC TAC GTC GAT GAG TAT GCA AAT CTC AAG 2888 
Ala Thr Leu Ala Gly Ser Cys Tyr Val Asp Glu Tyr Ala Asn Leu Lys 

935 940 945 

ACT CCT CAT TCT GTC AAT TGC TCT TCC ATG ACT GCT TCT GAT GCT CAA 2936 
Thr Pro His Ser Val Asn Cys Ser Ser Met Ser Ala Ser Asp Ala Gin 

950 955 960 

CTG TTT CTT CGG TTT GGA CAA GTT ATT TCT CAG AAC ACT GAG 2978 
Leu Phe Leu Arg Phe Gly Gin Val lie Ser Gin Asn Ser Glu 

965 970 975 

TAGTTTTTCG TTAGGAGGAG AATCTTTAAA ACGGTATCTT TTCGTTGCGT TAAGCTGTTA 3038 
GAAAAATTAA TGTCTCATGT AAAGTATTAT GCACTGCCTT ATTATTATTA GACAAGTGTG 3098 
TGGTGTGAAT ATGTCTTCAG ACTGGCACTT AGACTTCCAA AAAAAAAAAA AAAAAAAAAA 3158 
AAAAAAAAAA AAAAAAAA 



3176 



0057 

Sequence Number: 2
Sequence length: 9295
Sequence type: Nucleic acid
Number of chains: Two
Topology: Straight chain
Class of sequence: Genomic DNA
Source

Plant name: Shiroinunazuna (Arabidopsis thaliana)
Strain: Columbia
Characteristics of sequence:

Symbol indicating characteristics   Sites
Exon                                1803..1881
Intron                              1882..2227
Exon                                2228..2366
Intron                              2367..2467
Exon                                2468..2539
Intron                              2540..2643
Exon                                2644..2715
Intron                              2716..2809
Exon                                2810..2878
Intron                              2879..2968
Exon                                2969..3040
Intron                              3041..3118
Exon                                3119..3190
Intron                              3191..3266
Exon                                3267..3338
Intron                              3339..3421
Exon                                3422..3493
Intron                              3494..3586
Exon                                3587..3655
Intron                              3656..3740
Exon                                3741..3812
Intron                              3813..3888
Exon                                3889..3960
Intron                              3961..4048
Exon                                4049..4120
Intron                              4121..4209
Exon                                4210..4281
Intron                              4282..4349
Exon                                4350..4421
Intron                              4422..4508
Exon                                4509..4580
Intron                              4581..4706
Exon                                4707..4778
Intron                              4779..4860
Exon                                4861..4932
Intron                              4933..5018
Exon                                5019..5090
Intron                              5091..5176
Exon                                5177..5248
Intron                              5249..5412
Exon                                5413..5481
Intron                              5482..5576
Exon                                5577..5648
Intron                              5649..5726
Exon                                5727..5800
Intron                              5801..5882
Exon                                5883..6011
Intron                              6012..6095
Exon                                6096..6443
Intron                              6444..6519
Exon                                6520..6890
Intron                              6891..6974
Exon                                6975..7328
Sequence

GAATTCAAAG GAATAAGCAT CGGAGACGAT TTAATGTTAC CTCTTGACGT ATTTATCCAA 60
TTTATCCATT *££x* CCATAGCATC TGATCATCAT CATCAACATA TAAATAACCA 120
AATTTGAAAT GAACAAAAGT CGAATTGGTG ATATTGAAAA TCGAGTTCGT GAAATTGAGA 180
TGAATTTGAA GAGAGATGCG TGTACCGTTA GGGAGGAGGA GGAGA^GA 240
GAGAAAAAAG GAGACGGAGA TAACTCGCCG GCTCTGTTTC CATGGCGGAG GTGATAATGT 300
A^C GTTAGCTTTr TGTGGTTTGA GTTGGAGAAC JGTGGGJGGC T« 360
GTGGAGTGAC GACATTGGGG ATAACACCAG AGGCGTOTA TCICCGTTGG ACAAATTAn 420
ATTATGGCTA TGAACATTCA ACATATAATT TAATTAGAAA TTTGCGGATG AAAAAGAGGT 480
aUcmhg? kaaatggtt AAAAATATTA ACGTTGTACA GCAAATGATA ATAAAAAGTC 540
ZcZl ATGGAAAAAT AATAATTTGG GTTAAAATAA A = 600
TTTTAACTAT ATAGTACTTT TTGAGAAAAG ATAATATTAT GTGTATTTTT ATTGAAACAA 660
SE£» — ; ^ ™ 720
AfTTfTTCCT TCTTTTGTTG GGCCTGTGAC CCTTTTAGTT TTAGTCCACT TCGTTlluw 780
^™ GTGACAAACC GACOGGAGCC AACCAAACCG GTTAACATO 840
SEE CATATTTTAT TAAGTTTTGT GTTGATGCTA MCCAAAAAT CATTGG^ 900
C^ATTTCTA aatttagtaa taaacaaaaa caotagaaa tcacacgttc actatactaa 960
^aaaTaca acaaqatac taataatiaa agaagagaaa actgaaccaa 1020
aWgaa tttaaattag taattgaagt aagaagatga agaagaacat 1080
ZEES ACACTAAAAT CATATAAAAA TACATAATTA CAAAAGTAtt 1140
" ATTTATTGAT ATGGGTCATC TGTGAAACAA GCCACAGAGA GACAAAGACT 1200
SS G T GGGCAACGAA AG.ACaCC = AOGCCATTA A = 1260
rrrCTCCnC TTCTTCTACA TTTTATGACC GTTTTACCCT TCAAGAGAGA GAAACMAA1 1320
Scrcr ATcrcracr tctgcaaagc ttcagaactc 1380
a^aagatg atggggtttt taactttatc ctccccaaat AAncnm CDCTTCATCr 1440
rTCTCTCTTA CACAACAGGT CCCTACATTT GTACAATCTC CTCTCTTTAA AGACTCTCTC 1500
TCCATCTCTA TCTTACTCTG TATTTCTGTC GTCTGAGCAC TCAATGAAAC 1560
™ S££ ATTTGATGTG ATCGAAOGAT AAAAATCATT mTCTOGGT 1620
C ZZZ AAACAAAAAC AAATTTCTGT AGAAJTCATA A— GM^ 1680
rrAAirTrrr, TACATAATAC GGTTCTCTTC TTCTTCTCTA TCCTCTGTTT CTTCTTlAiw 1740
SSSSS GTATATCTAA AAACGCAGTC GTTTTAAGAC TGTGTGTGAG 1800
AAATGGCTCT GTTTAGAGAT ATTGTTCTTC TTGGGTTTCT CTTCTGCTTG AGCTTAGTAG 1860



CTACTGTGAC TTCAGAGGAG GGTCAGTTAT TATACTGATG CATGCTTCTT CAAGTTCAAG 1920 

attttcgtct rnTGTrnA tattagtgaa aaaaacttaa agatgagatt tttatatgat 1980

TTTTGAAGTT TCATTTGGTG AAAATGAGAT CTGQGTACTT GTTATT7TCT ATTTTTGCTT 2040 
TTTGTAATGG TTTTTTTTTA CTTGGTGGGT CTTCTATAGA ATCAAAAGAA GCTTTGAATA 2100 
AATTAGGGTT TGAGTTTTAT TTTGTTTTCT TGGAAGTTGA ATTTTTAATC TTCTCAAGAA 2160 
CTGACAAATA TTTTTTTTTG TTTTTGTGCG TGTGTGTTAA TAAAATATCC TTAAAACAAA 2220 
ATTAAAGGAG CAACGTTGCT GGAGATTAAG AAGTCATTCA AAGATGTGAA CAATGTTCTT 2280 
TATGACTGGA CAACTTCACC TTCTTCGGAT TATTGTGTCT GGAGAGGTGT GTCTTGTGAA 2340 
AATGTCACCT TCAATGTTCT TGCTOGTAA GTTTCTTCAT TCCTTTAGAT TACTATTACA 2400 
GTGGTTTTTG GTGTTCTTGT GGGAAAAAGT TGTAATTTGT TTTGTGTGTG TTTTCTATGT 2460 
TTTGTAGTAA TTTGTCAGAT TTGAATCTTG ATGGAGAAAT CTCACCTGCT ATTGGAGATC 2520 
TCAAGAGTCT CTTGTCAATG TAACTGTTTC AACATTCACT GTAGCATGAA ATAAAGTATC 2580 
TTACTTTAAT TCTATTCCAC TCTCTGAGTT GTGACTTTTG TCTTCTGTTT TTTTCTAATG 2640 
TAGTGATCTG CGAGGTAATC GCTTGTCTGG ACAAATCCCT GATGAGATTG GTGACTGTTC 2700 
TTCTTTGCAA AACTTGTAAG AACAGTGATT GGTGTTATTC TACCATTAAA CTrTTGnCA 2760 
TAGAGGnTT ATTTGATGAA GTGTGTTCAT GTTGTTTTTA ATTCAGAGAC T7ATCCTTCA 2820 
ATGAATTAAG TGGTGACATA CCGTTTTCGA TTTOGAAGTT GAAGCAACTT GAGCAGCTGT 2880 
AAGTAGCTAG TTATTCTGCT ACTAGTCTTC ATATGTCATT GCTAAAAATA TACTCACCAT 2940 
GTGGAATATG GATTTTTACT TTGTCCAGGA TTCTGAAGAA TAACCAATTG ATAGGACCGA 3000 
TO^AC ACTTTCACAG ATTCCAAACC TGAAAATTCT GTATGTTCCC CATGATTC7T 3060 
HZml SaCTTTTAG CTATATAGGT GATCATACAT GTCTAATTTC AATTGCAGGG 3120 
ACTTGGCACA GAATAAACTC AGTGGTGAGA TACCAAGACT TATTTACTGG AATGAAGTTC 3180
TTCAGTATCT GTAAGTGTCA ATGTTTTTTG AAGTCTGTCA ATGTCTCTTC ATTACCOGGT 3240 
SSSI G7ACTATGAT GAGCAGTGGG TTGCGACEAA ACAACrrAGT ^AACAH 3300
TCTCCAGATT TGTGTCAACT GACTGGTCTT TGGTATTTGT GAGTCTTCTT GCACATCTGA 3360 
S GAGTTCTTTT GTAAATATCA AATATCTGAC TTTGTTTTGA TAHGAATCA 3420 
TgSg AAACAACAGT TTGACTGGTA GTATACCTGA GACGATAGGA MTT^ 3480 
CCTTCCAGGT TTTGTATGTG CCTCTTTCrC TACTTCTAAA CATCATTACT GTAATTTGGG 3540 
TTACTTAAfiA AAATCTACTT AACTGGTTTG CTTATTACGA ^^^^ AttTTGTTAG 3600
ATCAGCTAAC TGGTGAGATC CCTTTTGACA TCGGCTTCCT GCAAGTTGCA AWTTGTTAG 3660 
TTCTCACCTC TACTAATCTT TTGCTTTAAA TTTTGGCTAG CCTTTGTTTT OTTTAAAGA 3720 
AGATCAHTT CTTATCTTAG ATCATTGCAA GGCAATCAAC TCTCTGGGAA GATTCCATCA 3780 

Sahggtc tcatgcaagc cotgcagtc ttgtaagtac rrncnaA atcaatga* 3840

CTACTTATAA CATTTTCATG AACTTAGGTT ATATGTTTTC TTTTAQJGAG TCTMGJOK 3900 

CAACTTGTTG AGTGGATCTA TTCCTCCGAT TCTCGGAAAT OTAOTTCA CCGAGAAATT 3960 

GTAATTCTTT ACCTGTTTGT TTTCAGTTTG GAGTCAAATG TCATACCATG TTAATGATAG 4020 

SaTCT miGGCTTT ATCTCrAGGT ATTTGCACAG TAACAAGCTG ACTGGTTCAA 4080 

TTCCACCTGA GCTTGGAAAC ATGTCAAAAC TCCATTACCT GTA7GACCAA CCTTCrCTTC 4140 

ACTTCTCTTT TTGCATACAG TCACTACTAA GTTGTGTTTC CTTATCAACT ATTTGTAAAA 4200 

TATTCATAGG GAACTCAATG ATAATCATCT CACGGGTCAT ATACCACCAG AGCTTGTCAA 4260 

GCTTACTGAC TTGTTTGATC TGTAAGTAGT TCTTCCTATG OTGACATGT T7TGATGTTC 4320 

T7ATGOTAT ATGAACTA7G TACATATAGG AATGTGGCCA ACAATGATCT GG^ACCT 4380 

ATACCTGATC ATCTGAGCTC TTGCACAAAT CT AAACAGCT TGTATGTATC TOTTCTCTG 4440 

AAAACTTCTC ACTTGAATGT TCAAGATTGG TGCTTTATAT GATTTTGTG7 CTCATTAATG 4500 

TAATGTAGAA ATGTTCATGG GAACAAGTT7 AGTGGCACTA TACCCCGAGC ATTTCAAAAG 4560 

C7AGAAAGTA TGACTTACCT GTAAGTATCG ACGCTGAGAA TTTCTCTAAT CTTATATAAT 4620 
ATATAG7TCC ACAGCGTTTG TTTTTTCGAA TTTCAAGTCA TTAACTACTG AGTTT7TGGT 4680 
TGCCTTTGAT TTATCGGTTC AACCAGTAAT CTGTCCAGCA ACAATATCAA AGGTCCAATC 4740 
CCGGTTGAGC TATCTCGTAT CGGTAACTTA GATACATTGT AAGTGTTTCT TGTTTTCTGT 4800 
GAAGTATACA TCATTATATG TGCCTTGTCT CACATTTATT AAATTTAATG ACATTTGAAG 4860 
GGATCTTTCC AACAACAAGA TAAATGGAAT CATTCCTTCT TCCCTTGGTG ATTTGGAGCA 4920

TCTTCTCAAG ATGTGAGCAT CCATAAGACC TCCAGTTTTA TTGTTTATTT CTAGCAAAAG 4980 

ATGAAAATGG TTCGTGAACT CTTGCATTCT TGTTATAGGA ACTTGAGTAG AAATCATATA 5040 

ACTGGTGTAG TTCCAGGCGA CTTTGGAAAT CTAAGAAGCA TCATGGAAAT GTAAGAAGTT 5100 

AACTTCTATC TGCTTGGTTA GAGTTTTTTT CATTTATCTC AATTACTGTT CTGAATTTGT 5160 

GTGTTTGTGG TTGCAGAGAT CTTTCAAATA ATGATATCTC TGGCCCAATT CCAGAAGAGC 5220 

TTAACCAATT ACAGAACATA ATTTTGCTGT AAGCAATCTT CCTCTTATCC OTCCAAGCr 5280

GTTAAGAAAT TGTTTTTGTA GAATGAAACT AAAACTCTGT ATACACAATA ATGAGGTCAC 5340 

TATAGTGTGA TCCAGGAACA TGTATTGGGT TGGTGATCTA TCTAATGTTG TGTTTCTTAA 5400 

AATOmGC AGGAGACTGG AAAATAATAA CCTGACTGGT AATCTTGCTT CATTAGCCAA 5460 

CTCACTGTAT TGTAAGTAGG CACCTTTGGT TCTGAAACAT TnTTGTCtt 5520

TCT7TGTGCA TCTTTTGCTA AGAATATAAC CCTGCAATCT TCACTAACTC ™TAGGAAT 5580 

GTaSa ACAACCTCGT AGGTGATATC CaAAGAACA ATAACTTCTC AAGATmiA 5640 

CCAGACAGGT ATGGTAATTT AGCAGGTTTT GGTATTGTGC ATTTTGTTTT GTTTGCTAAT 5700 
« CT caatc ia««* a«Ttcn« A«r«aAn «r« i^orrcT ^
CATGGTCTTA ATAGCAGCTT GCOGACCGCA TAATCCTCCT OCTTnClTG ATGGATCACT ^

===™== « 



TGACAAACCA GGTCTACTCT CCAAACCACT TTACGMTGT TCTTCACCTA CMTGTAATC ^ 
GAAGAATTGT AAACCGGTTG CGMTMW* hv.. . . „■ 

^..^CC^a^T^^C^a 6360 



rAMATTTAA TCCTTAAATT TCCTGGTGAC ATCAGTAACT TATTCGACAC CGAAGCTCGT 
TCTA«TCAC AAGTATATCA TTOGCAOGG AGCAICAAGC KTOffJCA AATOOTm 6240 



= = =» ~t = = 



£rc£»C n«ACn« GTGATGGGCA CGATAGGTTA CATAGACCCC GAGTATGCTC 6780 
SSSoK AMTCCGATG TCIACAOTTA TtjGAATAUTC cnOTOACT 6840 

= = ~ = ,» 



EES ™ — ™ = = 6*0 

TTrArrTACA TCAGATAATG TCAAAGAOGG GGAACAATGA AGTGATGGAA ATGGCAGATC 7020 

SS AtSS AAAGATCTCG GTGTGGTGAA GAAAGTTTTC CAACTGGCAC 7080 

tSSaC SXaCAG CCGAATGATC GACCCACAAT GCACCtfGTG ACTCGTGTTC 7140 

TCKOCTTT TATGCTATCG GAACAACCAC CTGCTGCGAC TGACAGGTCA GCGAOGCTGG 7200 

cTgGTTOTG CTACGTCGAT GAGTATGCAA ATCTCAAGAC TCCTCATTCT GTCAATTGCT 7260 

mcSS TGCTTCTGAT GCTCAACTGT TTCTTCGGTT TGGACAAGTT ATTTCTCAGA 7320 

ACAGTGAGTA GTTTTTCGTT AGGAGGAGAA TCTTTAAAAC GGTATCTTTT CGTTGCGTTA 7380 

aSSI ^ATTAATG TCTCATGTAA AGTATTATGC AOGCCTTAT TATTATTAGA 7440 

StL GTGTGAATAT GTCTTCAGAC TGGCACTTAG ACTTCCTATA ACTTmGCc 7500 

TATCTAAGTT TTTCTAAATT GGGTTATTCT TGTAACATAT CTTAGATCTA GTACTCAACA 7560 

„ ACCACAAAAG ATTTCTTATG CTCAAAAACA TATACATAGA AAGAACCTTC 7620 

S^A GAAACGTTTT GCTATGTAGT GTTATATGTC AACCACGTCT ATGAGAGTGC 7680 
aTcS S TiaCACnG GCAATAAAAA TCATAAACaA ATATATTGTC 7740 
TGATTAATT7 GTTTTTTTAT AATTTCTTAT AT7AATTCGA ACTCATACAG 7800 
S GTC Tc nTCTAGTTT AGTATAAAGT ACGTATTTTT GCAAAATCAA AATCGTAAAT 7860 
ACATACATTT TAAAATGTTA AAAAAGATAA ATCCGTACAC CATTTAAAAA TGGCATTTTC 7920

CTAAGATTTT TTTCAAAAM GGCATTTTAG ACAAGAACTA ATTACTACAA CTAAAATCTA 7980 

CTAACTTTGG TTTTTATGTA TACATTTACG AGAGTCTACA CAAAAAAAAT ACATAAAAGA 8040 

AGAAGTAGTA AATAATTAAA ACGTAAAAAA AAAGACTTTT CAAGAAGGCA GAAGAGTAGC 8100

ACTGTTGTGC GATTGTAAAA TCGTCTTGAT TGTTGTTTAT CCCACTGATA AGCCTACCa 8160 

TTTCAAAACT TGTTCTAAGT TTAAATTCTA TTTTTGAACA TGACATACAG TATAAGGCTT 8220 

TTTAAAGATA TCATCTTGAT TTTGTTTCTT CCACAGGGAA GCCCTATCCT TTCTTACATA 8280 

ATCTTTGTTA GATAATTTTT TATTATTTTC AAAAAAAATA AAATTGAACA TAAGTTTTCT 8340 

cIaACTAATA TGTTQAACA ATAATAAACA TAATATCATT TTTTTGTTTT AAACTATAM 8400 

Sat ggtaaaaagt tgcaatatat aaatcataat haaactaaa aatta^mt 8460

TGGTAACT7T TTCTTCAACA ACATGCCACA TTCGGCTACA TGTCCACTAG GAAGTGTTAT 8520 
TATAGAATCG TTAATtiTTGG GTACGCTTAT GAAATTATCA ATGTTTCCTT amtctatgc 8580
TTAGAAAATT ACCAATATTA CCTTAAAACT ATATTTACGA ATGACCAATA TTCCTTAGAA 8640
CTATGCTTAT GAAATTACCA ATATTTTCTT AAAACTTAAA CACAAAACTC TTTAACAAAA 8700
MMCTTTAT TTTTATTTTT ATTTTTTTGG CAAAAAAAAA AACTTTATTT ATAAAGTGM 8760
OTCTCCAGA TAATTTTGAA TTTCATTTTT CCAGTTTTTA TTTAGAATAA Ttmcnra 8820
TTTACAAAAT AAAAGAAAAC CCTAGGGTTT AGGGTTTAGG GTTTAGGAAA AAGCGATGAT 8880
ATATTAATTG TTATGAAATG TTTTTnAAA AATAGTTAAC CAAACATTTT TTTAAAGAGA 8940
GTTTAGTTTC ACAAKCATT TGTAAATTAG AGTAAT7ATC AATAAAAATG GAAGACAATC 9000
TAATTATTAT TTAGCAAAAA CTATATTTAG GAAAATTAGT TAAAGTTTAG AAATATATCA 9060
StS AAACTAATTA AAATTATTTA ATTTTGTGAT ATACGTCATC ATATAATTn 9120
ATCAATATTT AATATTATGA TACATGTAAC TCACTAAACC TAAATTTAGA AGAAAACTCA 9180
AAATAATCAT AACCAATTTA GATTCAACTT CTACTTTTGT TCCAAGAAAA AAACACATGG 9240
TTTGTTTTGT GGGATACTAA TGACATCTAT CAAAATCTAT GAAACCAAAT CTAGA 9295
Sequence Number: 3
Sequence length: 18
Sequence type: Nucleic acid
Number of chains: One
Topology: Straight chain
Class of sequence: Other nucleic acid, synthetic DNA
Antisense: No
Sequence

TATCTAAAAA CGCAGTCG 18

Sequence Number: 4
Sequence length: 18
Sequence type: Nucleic acid
Number of chains: One
Topology: Straight chain
Class of sequence: Other nucleic acid, synthetic DNA
Antisense: Yes
Sequence

AAGATTCTCC TCCTAACG 18
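
As a consistency check on the two 18-base synthetic DNAs above, the following minimal Python sketch looks for Sequence Number 3 in the 5' end of the Sequence Number 1 cDNA, and for the reverse complement of Sequence Number 4 (the antisense oligonucleotide) in its 3' untranslated region. The sequence excerpts are copied from the listing; the function and variable names are illustrative assumptions and are not part of the application.

```python
# Illustrative check only. All names below are hypothetical; the sequence
# excerpts are copied from Sequence Numbers 1, 3 and 4 of the listing.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string made of A/C/G/T."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(seq: str) -> str:
    """Remove the spaces that group bases into blocks of ten in the listing."""
    return seq.replace(" ", "").upper()


# Sequence Number 3 (Antisense: No) and Sequence Number 4 (Antisense: Yes).
PRIMER_SENSE = clean("TATCTAAAAA CGCAGTCG")
PRIMER_ANTISENSE = clean("AAGATTCTCC TCCTAACG")

# Bases 1-50 of Sequence Number 1 (the coding region starts at base 51).
CDNA_5PRIME = clean("CTTTTAAAGT ATATCTAAAA ACGCAGTCGT TTTAAGACTG TGTGTGAGAA")
CDNA_5PRIME_START = 1

# Bases 2979-3038 of Sequence Number 1 (immediately after the coding region).
CDNA_3PRIME = clean(
    "TAGTTTTTCG TTAGGAGGAG AATCTTTAAA ACGGTATCTT TTCGTTGCGT TAAGCTGTTA"
)
CDNA_3PRIME_START = 2979

# str.find returns a 0-based offset, or -1 when the motif is absent.
sense_pos = CDNA_5PRIME.find(PRIMER_SENSE)
antisense_pos = CDNA_3PRIME.find(reverse_complement(PRIMER_ANTISENSE))

if sense_pos >= 0:
    print("Sequence 3 matches cDNA base", CDNA_5PRIME_START + sense_pos)
if antisense_pos >= 0:
    print("Sequence 4 (reverse complement) matches cDNA base",
          CDNA_3PRIME_START + antisense_pos)
```

On these excerpts both oligonucleotides are found, so the pair brackets the whole coding region of Sequence Number 1 (bases 51 to 2978).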
Claims

Claim 1 DNA that codes for a protein that includes an
amino acid sequence that possesses the action of
controlling plant morphogenesis that possesses the amino
acid sequence shown in Sequence Number 1 or alternatively
that amino acid sequence in which one or a plurality of
amino acid residues that do not affect the action of
controlling plant morphogenesis are substituted, excised
or inserted.

Claim 2 DNA that codes for antisense RNA that controls
the expression of the DNA of Claim 1.

Claim 3 The DNA of Claim 2 further characterized by
possessing a base sequence substantively complementary to
at least one portion of the base sequence in Sequence
Number 1.

Claim 4 Plants whose morphology has been transformed by
the DNA of Claim 1.

Claim 5 Plants whose morphology has been transformed by
the DNA of Claim 2 or Claim 3.

Claim 6 DNA that possesses at least one portion of the
sequences represented by Base Numbers 1 to 1752 in
Sequence Number 2 and that controls the expression of the
DNA of Claim 1.
Simplified description of the drawings

Figure 1 is a restriction enzyme map for the morphogene
clone acquired through the practical embodiment.

Figure 2 is a drawing showing the T-DNA insertion site in
the chromosomal DNA fragment inferred from the
restriction enzyme map and Southern analysis.

Figure 3 is a drawing showing the positions of the
subclones of the sequences adjacent to the T-DNA
insertion part.
Abstract

Objective To obtain a gene that controls the
morphogenesis of plants, and to employ such gene in order
to control the morphology of plants.

Means employed A gene that codes for a protein that
includes an amino acid sequence that possesses the amino
acid sequence shown in Sequence Number 1 that possesses
the action of controlling plant morphogenesis or
alternatively the sequence in which one or a plurality of
amino acid residues that do not affect the action of
controlling plant morphogenesis are substituted, deleted
or inserted, or alternatively the transformation of the
morphology of plants by means of DNA that expresses
antisense RNA against such gene.
EXHIBIT D

(19) Japan Patent Office (JP)
(12) Unexamined Patent Gazette (A)
(11) Unexamined Patent Application Publication No. H9-56382
(43) Publication date: March 4, 1997 (Heisei 9)

(51) Int. Cl.    ID symbol   JPO file number   FI           Technical indication
C12N 15/09       ZNA         9162-4B           C12N 15/00   ZNA A
A01H 5/00        ZNA                           A01H 5/00    ZNA A
C12N 5/10                                      C12N 5/00    C
Applicant: Josette Masle et al.
Serial No.: 1 0/5 J 9, 135 
Filed: August 15, 2005 
Exhibit D 
^^9-56 3 82



i m*m i ] mvncv&m&at&fflm- * «§« * * l . 
mm^- 1 t,z*-tr s j mmqxuzcoT s s m&mz 

lXJi2JaJiOT5yi3£2coB8t. X*£>&U±fiiA 
NA. 

ryft^RNAja-ftl.DNA. 

[ it^ja 3 ] 1 1 < t *> 

CODN A„ 

mm, 

[ ts^n 5 1 ts^ii 2&wt3 tm<o D N A TimSe 

DNAtint. ffi?i|##2WJgS#-^l~l 7 5 2T 
lS£ftl>ffi?iJ<0'>=SK t w-ra D N A„ 

[000 1] 

y^RNAjn — HfSDNA. MtftdiX^CODN A 
[00 0 2] 

-ttcJ;OWfflf B acoff^Sr^.|»^r&* < J>.&« { . Sift 

[ o o o 3 1 itz. ^^sm^m^m^mmm^miz 
£*oj&Bizm®£5-i.z>mK?i,z'£mmz*). zcom 

^tzmw^mz^-th z t imisizmmx-fr h . 

[0004] ZZZW'faco&f&BlS.ZmWtZmteTS: 

<m.-t&ztti i x-zft.\$. mmk^zn-thrv^v 

XRNA£ftM-f&£ol<zmfr&&tz'<7-7- (T>+ 

[0005] i^o 4 (T5t'HTv-X -IC'jr 

-J- (Arabidopsis thaliana)) (i. SjSSSH (vegetaive s 
tage) #»4>Pf£J8i (reproductive stage) ^coffl&lzft 
*7)#^SrffV\ ^>£j£&V'*ft(;:ia<S0lil (intern 



ode) <ntfifk{m<mmLx*>o. mxtzMffrmt>ti 

it^rfi^IsC (highly ordered branching pattern) Stt^ 

i/o-f Jf-r-X-fW^ieW^xn^^r (ecotype) 
i LT£]<=>ftT^&Landsberg erecta ( yyX/<—^ ■ 
mi. nW&X-V?? (erecta: *>Wt) X 

m^ntxao. mtc&immmz^-t. mmzw 
mteth^vwYtci&tiimm-h. stmt. &w$t 

im (pleiotropic) X'h*) , Rfl^fSfc^^FS^ 
(silique) £3rf& (JjLh, Huang. I. et al., (199 
1) UCLA*- Z.b-yisy#iSVM.X'miZtltz& 
g) . 

[0006] Landsberg erectatt^O^St H— ii 

&Fffi<ox»e«r & z t vmm^mzmcnx ^z*> 
^KizmmmftstLxtso. er-1011*, er-vm&v 

er-103ffifcifo£$ftT V>6 . 
[0007] 

[f&BJW^&Lidfc^&ISRjg] .kiSLfcJ: dfc:, ttW 
[0008] ±ie«j£*^&$ft*:i<0T'*> 

usst-r*. mmm^xutm&Ti.zn-thTy 

f-fe R N A £ 3- K-f S D N AX-m^^BWkWk 

tm^m^i-hz t t>mmt ltv^s. 

[0009] 

te. ^mm^i^a^ 5^x^coi^fe«cDNA*^fi 

iit^^ ^- d n Ax-mmm Ltzwwmn&mimiL 
-tzzttfL^KL. ^m^m.-t^zm.^tz. 
[ooio] t-^^.*>*%Bfl{i. nmmmms&t/m 

^S?S1t$rWL. ffi?"i#^i^-rT5yi?iE?'JX{i^ 

Mi 3- r-'-TS D N AX'tbZ . 
[00111 tttftWii. ±tecODNA<7)^SrW©J 
t57yf-feyXRNA£3-Kt6DNASMt 
>lcODNAi:LT«±. 1 siMnt&mm "Jco 

[00 12] *JHWiS<s»fc. mttV'WtL&n-Xi 

■t&DNAX'&mmzixtzimi*. msmtiTv** 

VXRNA5:3-Ht|,D N A X'&mmz tltzWmK 
[00 13] *?6Bflir)DNA{CJ: V^-FZtlh?

<. mvtcDBm&fcwm. mzmcowmzmhzfy 

i^QXh 0 . Z <r>-9 y/^nzn- F-tZ&fcf-X'W. 

[00 1 4] — 2f. frieryf-tyXRNASrn-K-f 
z f-<r>fttt : k$m\-i-& D N AX'WtoZ Bm&mt&Zt \z 

[00 1 53 S, ^ifcfc^-c, r Mf*DNAj 
SDN A&VZCODN A^ztt-tZMfc^Z^d . £ 

tz. *&mzj:om&ztL2>mm<oBmBtfL<?>fflmzm 

[00 16] 

iftwconmcommi jar. ^mco^m^Bmimm 
-r&. *^wiif5^ii. wi-tf. Bm&ti&fflwizffl 
-r^mi^hmam^. zcnmmBmzmzmfr 
h$£mm*r¥-*MiLMi-z,zt izi. nwk-rzztwx'* 
h. suz. ftbixfcgmm&^t&mmmizm^x 

fm Ltz* 'J dy ? H ^ro-/t -fh>\4 y 'J 

yj e-is 3 y . hh \^mmms^-cr,^mm^zm^ 

«PCR (tK'J.*?— tf - y ■ >JT7isay) 

(=J:0. ff±Sfift<OS^f*DNA*^?f±SitfET£ 
m^hZti)-X'^h. 

[ooi7] BmBtftnmmizmrt&Rmztt&mm 

Rv*ftwni&m^cr>mmmzmm<.zmM-z>. ft. d 
na^k. >it£. Bm.mm. mz.T<F>t&mm\\<r>& 

yj y-i -^-^ 3 ym-^tcomBrmmt^ 
m^mu. &mmz8>mT?&iti9Kcr>&mmi l zWitt2ti 

TV^SJB^^, Molecular cloning (Maniatis T. et 
aJL Cold Spring HarborLaboratory Press)Hf£K^iX 

[ o o 1 8 ] < i >mvK7)BmB®.&fflw-?z>mte?cr> 
mat - 

( i ) BBB&<omwizffl?z&n£iiji-zimt*coft 
m 

mm. mz-ifi/nj z+x+<mfm\&z%mt&&m. 

AU ifefe(*tDNAtctfA$-(i-€.C:i:{Cj:-5T. f$A25 

^A^ffl^&^-a;-^, <i^rar-7-5Xr-ajfla(^i-s 



[00 19] CCT'ti. /<-f ■f-y-*?*-* (flftttl 

sfct^aBBB^s^fflco^-^-jte^^^o^^ y- 
%) zm\^tzT?uj<9TV^i±mi&mz£r,x*\Jkm. 
^zmmwmzmxL. BmBtit&MW-tz&GrFiz 

[0020] yu4 Z-fX-t (Arabidopsis thaliana) 

? (in planta) T^oA^-f'J ^AJSSfcffi (Chang S. 
S.. Park S.K. et al . Plant J. 5, No.4(1994)) Sri' 

-yy^tsm^imizmmL. ±w 
>w yv-?4 yyfflit£jn?B , m®m.m£v 

•y i? ">-;K;^tt LX W(£ L . 2afl4M&B#IF££tt 
SUCJt^T (erecta^gflc) ^S^KJtcS 

[ o o 2 1 ] z o Lxn^ix^mmt. mrnvmrnm 
kizx^xgmLx^&simttfm^. 

[0022] ( 2 ) Bm3gL&?<0&m&{xr?e>%.m. 
±%&r>£ o IZ LXfte>tLZ,BBcr>3£itl f.z$£m&frt>. 

*cD$£mzm*3&$znmmT& . ^m^snu 

^>Cell. 35 (1983) p.35i2»<0;frffi^TSfefeflcDNA£- 
A*, %&mzftAZtit:'<-<( t'J^i'-mtv^ 

[002 3] ±ES«DNAT^a^®!!S^L, V 

NASriaiR-r&CtlCioT. /<>f •r>)-^7?-t& 
IzBm&WM&T&^timeitoD N ABrM-^t#-i. i t«< 

[0024 3 35S«1%l«!*»feSSfeft:DNASr 
^•TI»Ct(cJ:0^fef*:5^r7U-^f^l!^. Z<0 

y^m^t^zt^zx^xL. ^fmfei-WK&^-th 

■rv-yzwsi-thzttfX'Zb. 

[0025] ( 3 ) Bmimm&^m^mm&f-iom 



}#H¥9-56382 



m 

fcfcJ: oTt>. *WfPMBrf-*Jm* hZb #X' $ 
[0026] iMimmmx-m^ixtz^a^ jd-Xi-com 

mmmmr asts d n Am^^&mm , iwisik 

?"J#^ 2 tc^-f . £ <7)iifE^ li . 2 7 i«i 9 V y < ex 
on) fc 2 6fI<7M yhoy (intron) £-£vCT-l^&. 
[0027] _kffi<OJ: 5 is, ##5^»£TO£«<M 

&m&f£.ZfflW-tZ>?>J<7m£3-V : -?&DNA£'i% 

cDNA^^f 7*7 y — ~>aj z-t-x-t-cot&h&tm 

A^mRNASrttajL. itlg^^^fflV^T c D N A£ 

0- fiSlL-, .1?LM5 — tfRJCE(c«toT2*||'ftL^tcoSr 

tmt&ZbtfX'th. cDNA7\3--yy*-,VW 
1{iUZ1xX^h<r>X'Ztlt>i:timi,Xi*£\.\ ft&tltz 
cDNAyJ7y<J-frC>. ^Wm^^Xu-Xb 

ixmmimmm? eDNA^D-y^is. 
[0028] ±ie<oj: o iz Lxmmmmx-nhtitzc 
DNAcossie?ij, x.vze>t&msim*t>imzit&T 
mmmm^uz^. zcdm&?<o 

Smm<M±. Nature. 345 (1990) p .743t,ZfmZtlX 
^6RLK5fcffl|5|1±2-7FL*:. ZWRLKfflx?^ Wfm 
ICSStSlz-fer^-Swrnf-f \£b LX 

<r>w&k*9-v\±. mB&tic.co®m<,zm-tzm&?tiz 

izimmim*'* tzmm^xh^ t #i £>*x& . 
t o o 2 9 ] < 2 >Bm$w&fci L <nmm 

*<cw*>4itg* x-h o . z nmmrTwmz&fimto 
u mmm^coftmmzimuZithztizX'ix . go 

[0030] *&w<7)&&? zm^xwtoz&'gsmt 

Zlzii. xi/?hD,fl/-y 3 y (mss.W???U£) hh 

mt/:£<r>lTmiz£r>X. Tuhr^XMCDNA^A 

1- <xtf«tv>. -f-coBg. 3|s^<0itgT b LXlt, Sfefift 
ae^-Srffl^Tt «kV^L. mRNA^SLtcDN 



[003 1 ] *f6W«5ie : F«)«3K*«W!-r6r 

nSraRN AO^^Xti^cO^ < t t-^t^ffl^W^ 
ffi^l^tSRNA, ^^-TSDNAT-tttlSr^te 

z t tfx-z zzt trnw ztih . 

[0032] 7*>^~teXXRNA£?&gi-f&DNAJi. 
TVfHryxIS (-tyxg((n-Hg[)lcfflffl^JgSE 

<r>Tmzm&-r&zti,z£*y&c>tih. m^mttm, 
-?-v>Tmzm£t2>ztt,z£ont>ixz. m. tv 

f-tyxglt. ^fift(DNAj,I.V^cDNAA^#fb 
ft SSfefltDNASrffl^S^Cli'f >Vuy\$*. 

ry^-byxm<r>^<tt-^tLxii. 3-K 
iHNL 5' #HRMtKXtt3' *WMWS<0V^ttt 
fflU#*>. ££>{c. ryf-by^RNA$-3-Kti.D 

NA<±. 3' ^StKISigfcrjDX. . mRNA«3' ttftlZ 
fttSftS 'J ASftcfflMW^r^ y d U ?r 3- h"f Sffi 

[0 0 3 3] *^CifflLH5rnt-^-tU 
ti. CaMV 3 5 Srot-^-^'#(f htlh . T 

y f-t y x r n a £ a - v -t h d n a xwfa £ m warn 

[0034 3 5 i^mcnmiFf-n^- H^O± 
6. Zcomt&kLXli. SMff2<7)SSS^l-17 

5 2X'm^tt^un<o^t£<th-^^tsm^ xo 
Mfcmzizmmm^3 96-175 2cr>mmtmft>ti 
h. znmmt. wmfflmz&^xmfcTomifflmz 
nm-tz, z t *<t-# hzb wmftztiz, . 

[0035] 

izmpnth. 

( 1 ) mmm^mmizm-t^mmi.^icoim 

UUA Z-t-X-t- (77t'K7^) X 3 * 4 7"WS ( Was 
silewskija. LEHLE SEEDS ttiO^A) &IZ. TiXyX 

F^T-g»S»"rfi6) •CJ>§pGDW32Sr^>r^o^'^7 : -U 
*7.&.EHA10H* (AgrobacteriuiB tumefacience) Sr. y 
Xyyf (in planta) r^DyN'^^y -7A!gtJ&£ (Cha 
ng S.S.,Park S.K. et al. Plant J. 5, No.4(1994)) 
IZX *)Sm^itfz. 

[0036] Mtomzu. arcox ? tc uffot. pg 
Z^mWLl Ou 1 Wambo r g' s B5^«
(y3l2%5:^ P H5. 5) 19 0^1«|? 
(1/20) Lfc. »tfrfel9-2 3B«, 

[00 37] ^^/itf*>0O^HI*-f co»»(c*5^ 
Tl l^^*fflv^TW0JR0. 26Gxi/2t« 

iiAL£, «I<7)I§L ffSJg^3 00 0 — 4 0 0 0;P7^(C 

/Co 

[00 38]#^:1^, lOjug/mlWW /07>f 
^ySr-irtfB5«^ifi(zfl|«L, fflJ£22°C\ SS30-40 

mraco^«iT-±WS*/v:. «**iil/1000^*3RL 
fcHYPOIteX («±«jj* (ft) «) «rfflV^. A^nv 
4 v >- WttSr*'fpGDW320T-DNA»Affi^ISr^O^WiE 
flMBUjSrP'/^^^Kc^ttL-CTftiEL. T4g^- (IM 

f#/v!o 

[0039] ( 2 ) ^H&fE^OJUiSt 
T^b*Kr^<7)yyADNA^. Cell, 35 (1983) p.35 

[0040] -80°CT-aK&L*:T ? t'h'ryXfi« 
(«) 5g Sr. ?L#*fflv^T«f*»*T««*(=«c6 
*TUH&U <Ift£25inl CO DNA *|ffl^7r- (50 
mM Tris-HCl.pH7.5, 0.2M NaCl , 20mM EDTA-Na2, 2% N- 
y^uj ^f;^i/yth'Jr)AI, 3g/ml 5% T 

Ef&ft7xy-;l^) £flD*.TfiBf U 25ml C7)7xy-^ 
/?na*^A£flO;t£?lL 1.5ml C010% SDS ( Ff^ 

ISKth'J^) *SPiT104HH. aa-cii^fc:* 

ffL*:, ,Ift*6000prn TlOftSK^fcft, *«^25ml 
^^xy-^/^on^ASrjD^/v!^, SJg6000prm 
T10»a*Ufc. CO*/flC15mlOX?y~/P£flD;t. 
MB¥L?tf*a:^(c, 6000prm ±}ff*» 

itiatC25nlO70^^y-^S:an^T, #;Hr/7 
XfcJ: U 6000pm. T104H8'DLfcflLh»«:* 
T, *ETt«aiU:. Clfl£400/zlcOTE,pH8.0 (10/i 
g/Acl RNase) tCjg»Lfc. 

[004 1 ] ±I£co j; 5 £ LT , ^JB^fiKcOflJWtcWi- 
SS»**^iH«L^ 200 ug <7>£rfe<*DNA M^X 

csci m&'bmtzxnnmLti. zcoo%. co 

DNA £ EcoRIXttXbalTGJSrU 7x/-/l/« 
^il/c^DNAT'^:^ (XLl-Blue MRF' (STRATAGEN 



Ett^^IA) ) fc#»E»L;t. ^tSJU. 1*1000=7 

p^wyty v ym&z^-tm'mimzn & z t 

[0042] ±MZ<r>£ o iz IX l^X* ^-^ft/cT^X 
5 Wo-h. EcoRlT'WKL/:DNA^^#^ix/:^^5:pR 

EaSX^pREb. XbalT^Br LfcDNA*>£» hfltz t OStpRX 

a&l^pRXbfc^L/v!. 
[0043] ( 3 ) SMg^<o*J! 

±IE<o J; 3 lz tx l/x^fa-^n^x $ 

A ^4 ^9 U-;frfccDX»a^o#*£fTofc. P 1 

7r-yX;^- (DU PONT&a^lRAL*:) 

t\ The Plant Journal, 7 (1995) p. 351^10550^ 

ra^DNA^/^y-ssKLfc. ±ibpr 

Xb fc pREa^ -eil-filSW! L^EcoRI-XbalBr^ 3 2 P M 

f#<oft. -Hl«i28D7, 61H10fcffc&Lfc. Cil£>0? 
Q->-c7)©JRRS^«ya^lS*tTV\ 28D7teifr25kb. 61 
H10te#T75kbcor 7 b* KT^XlfefettD N A*#*3»A 

mxizft£ti&&mmm?<7) j Q-7'?n-->7i:mmL 
tz. zcom. j-vmzmm^&ffiiinfflmwmmmt . 
mxw\K<o*r4f>tmzmy& t . »ABrfr±coT-DNA» 

AgPfi5r«^L/s: (H2) . i^#A»fflOififlW)!fee 
<*DNAie^J$:Bluescript II SK+(C^7'^n--y>? r 
L/C* f#^^/^r^n--yc7)ffABrfr±^fira^[l3 

[0045] ( 4 ) »MW*<0M!Bfc^4ae^W)cD 
NA^o— yc7)#SI 

W^mR N A SrPSS L > Molecular cloning (Sambroo 
k, J., Fritsch, E. F. and Maniatis. T. (1989). A L 
aboraroty Manual, second edition, Cold Spring Harb 
or LaboratoryPress, Cold Spring Harbor, NY) <7yf5 ; fc 
£J:9cDNA£!$|$!U AYES (CLONTECHtt^^^A 

tz. --ft. B9iepRXb^pREa^i?>T-DNA^K»-r§S^JS: 

wosu. ztiz 32 pmmttz. ztihtru-ytL 

X. yr-V'yA ^7 , J-C7)300,000T^— 7£Mmt 
LXT7-7'^7Vy>H£-is3y£ft*,tz. ZoL 
xmtRLtz7°yA* H^pKUT 1 6 1 1 Ltz. 

[0046] i5)»mtA^mtzmthmsrF^m 

pKUTl 6 ll,Zl££tl&cDNA7n--yt%&fo& 
ttn&g&ffli&fe Ltz. c D NA<WB£«ffllRVZ 
?'l€rffi?ij#^2tc^-r.

[ 0 0 4 7 J cDNAcOJSSie?fW^«l^$fL^r 5 J ®& 

n^mm^^tztz^. z com&^mmmm . Na 

ture, 345 (1990) p .743KKS5£*lTV>.&RLK5fcffi|3) 

*r^-i«rnf^ \ft Lxm&Zixfztf, 

5riT«^< ^T'J) & . C CORLISSES OHH^? - > 
Ut, »JH»KE<0WWfcBW*Jte : ! t i:«»5r'j, ifi±3* 

[0048] ( 6 ) «nt»K(0«J»(:n-r&XIIMc(Ol)| 
Of 

ME ( 1 ) T#^il7t}g®^«W©J»tcRrr«.S:S«: 
»i, ^<7)|*^«®*»f>. Landsberg erectaflOTXU? ^ 
33* t ISI-ilg^acO^T-* & fc # £ . fit 
i"f , 5fc<OLandsberg erectaft. er-103fl=h*^T# 

tih^mi-t^xm-^tm^m^mizmmi-i z t 

WWU «tL*:3EI***ei-104R4: US. 
[0049] Landsberg erectaflc, er-103^*->^m RN 
Afc*KU RT-PCR <i£*E¥- P C R ) aj-Clt|i|g 
U:. •t ; 5:*>*>. mRNAZmmt U Id?ij#-v4tc^ 

LTiS»K¥Rja5S:^T^. «^Tf*^>flfc c D N AZ®m 

ffi?'J#^ 3 t^-rSSfieyiJ £ h * U rf ^ ? U it * V 
£5' MTy-^ v-KJflVvePCRRJEfcfr-?*:. 
[0050] ff6ii£ifM&«|*a£fc L, JJW y rf 

iaaE^IS-^SL/i. -f^m. Landsberg 
erecta«T-iill?lJ§-^ 1 <7>JSS*-Sf2299§<DTA<Alca 

«a $ y ^yaacaHiLT^fe. er-io3i*-c-{iie?i|j!iffi 

1 ?>igS#-5y896#OG#A (Cg^-tl. d t (c J: 

[oo5ij ztic,-£xco2zmw-fflx\ mmmm^um 
ittuzmz^mz&^z ttfmwi. 



[0052] ( 7 > f&mmwm&7-<r>%jmvi 
±Mn mizLxmrns ix/ccDNA , Mtx izm&.t*mm? 
tfBm&f&zfflm?&MmTX'$> hzt zmz? h 1 0 
—onumttx. %mtmm<7)&tmx'nmmft 
Rxsm^mmt&gimmcr)fflX'(r>Mm?cr,mi 

«?)3Mt£. y— fyfttfilzj:*) ttmttfLtz. 7o- 
/CUcDNA^D-yiffll^. Landsberg erecta3! 

[ 0 0 5 3 ] ttz, m^mwmco^mmx'(r>mmmmm 
mw&R&fWi tfintzmm* zzmftx-mmcosnu)* 

ML^m, K%yxcomi3Ltf2<. Z<r)Ztfrt>i>Z 

<Dmix^tfm<ottmzm£>hm&?x'$>zz t vvmz 

tifz. 

[0054] ? t>iz. ±au: j o tz. jmmmm&f- 
co&mte®fflkv : &TMftmw)X't) h <ox- . * aijioste 
Tommmmmt. mmzMiztmrnttsmcnm 

?m*jr2<7)mmm^ 1-1752 x-mzti&wm<?yj?% 
<ti>-&zists®.®. £omfcmziimm&^3 9 6 

- 1 7 5 2 WS^A^ff /?>^l> . 
[0055] 

hmm^m^^tih. zcr>m&?<omi3.£m*rt-z 
tizx-ix. mcDW&zimx'Z&zttfmzztih. 
£tz. zn&Grf-iztt-fhTy+^yxRKAitm-t 
h D N A B£?i)TiB flj 5r J&tH$Etfk't' h Z t tz± 0 , ^Wff 

[0056] 

[ffifimj 

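Sequence Number: 1
Sequence length: 3176
Sequence type: Nucleic acid
Number of chains: Two
Topology: Straight chain
Class of sequence: cDNA to mRNA
Source
Plant name: Shiroinunazuna (Arabidopsis thaliana)
Strain: Columbia
Characteristics of sequence:
Symbol indicating characteristics: CDS
Sites: 51..2978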



CTTTTAAAGT ATATCTAAAA ACGCAGTCGT TTTAAGACTG TGTGTGAGAA ATG GCT 56
Met Ala
1

CTG TTT AGA GAT ATT GTT CTT CTT GGG TTT CTC TTC TGC TTG AGC TTA 104
Leu Phe Arg Asp Ile Val Leu Leu Gly Phe Leu Phe Cys Leu Ser Leu
5 10 15

GTA GCT ACT GTG ACT TCA GAG GAG GGA GCA ACG TTG CTG GAG ATT AAG 152
Val Ala Thr Val Thr Ser Glu Glu Gly Ala Thr Leu Leu Glu Ile Lys
20 25 30

AAG TCA TTC AAA GAT GTG AAC AAT GTT CTT TAT GAC TGG ACA ACT TCA 200
Lys Ser Phe Lys Asp Val Asn Asn Val Leu Tyr Asp Trp Thr Thr Ser
35 40 45 50

CCT TCT TCG GAT TAT TGT GTC TGG AGA GGT GTG TCT TGT GAA AAT GTC 248
Pro Ser Ser Asp Tyr Cys Val Trp Arg Gly Val Ser Cys Glu Asn Val
55 60 65

ACC TTC AAT GTT GTT GCT CTT AAT TTG TCA GAT TTG AAT CTT GAT GGA 296
Thr Phe Asn Val Val Ala Leu Asn Leu Ser Asp Leu Asn Leu Asp Gly
70 75 80

GAA ATC TCA CCT GCT ATT GGA GAT CTC AAG AGT CTC TTG TCA ATT GAT 344
Glu Ile Ser Pro Ala Ile Gly Asp Leu Lys Ser Leu Leu Ser Ile Asp
85 90 95

CTG CGA GGT AAT CGC TTG TCT GGA CAA ATC CCT GAT GAG ATT GGT GAC 392
Leu Arg Gly Asn Arg Leu Ser Gly Gln Ile Pro Asp Glu Ile Gly Asp
100 105 110

TGT TCT TCT TTG CAA AAC TTA GAC TTA TCC TTC AAT GAA TTA AGT GGT 440
Cys Ser Ser Leu Gln Asn Leu Asp Leu Ser Phe Asn Glu Leu Ser Gly
115 120 125 130

GAC ATA CCG TTT TCG ATT TCG AAG TTG AAG CAA CTT GAG CAG CTG ATT 488
Asp Ile Pro Phe Ser Ile Ser Lys Leu Lys Gln Leu Glu Gln Leu Ile
135 140 145

CTG AAG AAT AAC CAA TTG ATA GGA CCG ATC CCT TCA ACA CTT TCA CAG 536
Leu Lys Asn Asn Gln Leu Ile Gly Pro Ile Pro Ser Thr Leu Ser Gln
150 155 160

ATT CCA AAC CTG AAA ATT CTG GAC TTG GCA CAG AAT AAA CTC AGT GGT 584
Ile Pro Asn Leu Lys Ile Leu Asp Leu Ala Gln Asn Lys Leu Ser Gly
165 170 175

GAG ATA CCA AGA CTT ATT TAC TGG AAT GAA GTT CTT CAG TAT CTT GGG 632
Glu Ile Pro Arg Leu Ile Tyr Trp Asn Glu Val Leu Gln Tyr Leu Gly
180 185 190

TTG CGA GGA AAC AAC TTA GTC GGT AAC ATT TCT CCA GAT TTG TGT CAA 680
Leu Arg Gly Asn Asn Leu Val Gly Asn Ile Ser Pro Asp Leu Cys Gln
195 200 205 210

CTG ACT GGT CTT TGG TAT TTT GAC GTA AGA AAC AAC AGT TTG ACT GGT 728
Leu Thr Gly Leu Trp Tyr Phe Asp Val Arg Asn Asn Ser Leu Thr Gly
215 220 225

AGT ATA CCT GAG ACG ATA GGA AAT TGC ACT GCC TTC CAG GTT TTG GAC 776
Ser Ile Pro Glu Thr Ile Gly Asn Cys Thr Ala Phe Gln Val Leu Asp
230 235 240

TTG TCC TAC AAT CAG CTA ACT GGT GAG ATC CCT TTT GAC ATC GGC TTC 824
Leu Ser Tyr Asn Gln Leu Thr Gly Glu Ile Pro Phe Asp Ile Gly Phe
245 250 255

CTG CAA GTT GCA ACA TTA TCA TTG CAA GGC AAT CAA CTC TCT GGG AAG 872 
Leu Gin Val Ala Thr Leu Ser Leu Gin Gly Asn Gin Leu Ser Gly Lys 

260 265 270 

ATT CCA TCA GTG ATT GGT CTC ATG CAA GCC CTT GCA GTC TTA GAT CTA 920 
He Pro Ser Val lie Gly Leu Met Gin Ala Leu Ala Val Leu Asp Leu 



(8) #IPfr¥9- 5 6 382 

275 280 285 290 

AGT GGC AAC TTG TTG AGT GGA TCT ATT CCT CCG ATT CTC GGA AAT CTT 968
Ser Gly Asn Leu Leu Ser Gly Ser Ile Pro Pro Ile Leu Gly Asn Leu
295 300 305

ACT TTC ACC GAG AAA TTG TAT TTG CAC AGT AAC AAG CTG ACT GGT TCA 1016
Thr Phe Thr Glu Lys Leu Tyr Leu His Ser Asn Lys Leu Thr Gly Ser
310 315 320

ATT CCA CCT GAG CTT GGA AAC ATG TCA AAA CTC CAT TAC CTG GAA CTC 1064
Ile Pro Pro Glu Leu Gly Asn Met Ser Lys Leu His Tyr Leu Glu Leu
325 330 335

AAT GAT AAT CAT CTC ACC GGT CAT ATA CCA CCA GAG CTT GGG AAG CTT 1112
Asn Asp Asn His Leu Thr Gly His Ile Pro Pro Glu Leu Gly Lys Leu
340 345 350

ACT GAC TTG TTT GAT CTG AAT GTG GCC AAC AAT GAT CTG GAA GGA CCT 1160
Thr Asp Leu Phe Asp Leu Asn Val Ala Asn Asn Asp Leu Glu Gly Pro
355 360 365 370

ATA CCT GAT CAT CTG AGC TCT TGC ACA AAT CTA AAC AGC TTA AAT GTT 1208
Ile Pro Asp His Leu Ser Ser Cys Thr Asn Leu Asn Ser Leu Asn Val
375 380 385

CAT GGG AAC AAG TTT AGT GGC ACT ATA CCC CGA GCA TTT CAA AAG CTA 1256
His Gly Asn Lys Phe Ser Gly Thr Ile Pro Arg Ala Phe Gln Lys Leu
390 395 400

GAA AGT ATG ACT TAC CTT AAT CTG TCC AGC AAC AAT ATC AAA GGT CCA 1304
Glu Ser Met Thr Tyr Leu Asn Leu Ser Ser Asn Asn Ile Lys Gly Pro
405 410 415

ATC CCG GTT GAG CTA TCT CGT ATC GGT AAC TTA GAT ACA TTG GAT CTT 1352
Ile Pro Val Glu Leu Ser Arg Ile Gly Asn Leu Asp Thr Leu Asp Leu
420 425 430

TCC AAC AAC AAG ATA AAT GGA ATC ATT CCT TCT TCC CTT GGT GAT TTG 1400
Ser Asn Asn Lys Ile Asn Gly Ile Ile Pro Ser Ser Leu Gly Asp Leu
435 440 445 450

GAG CAT CTT CTC AAG ATG AAC TTG AGT AGA AAT CAT ATA ACT GGT GTA 1448
Glu His Leu Leu Lys Met Asn Leu Ser Arg Asn His Ile Thr Gly Val
455 460 465

GTT CCA GGC GAC TTT GGA AAT CTA AGA AGC ATC ATG GAA ATA GAT CTT 1496
Val Pro Gly Asp Phe Gly Asn Leu Arg Ser Ile Met Glu Ile Asp Leu
470 475 480

TCA AAT AAT GAT ATC TCT GGC CCA ATT CCA GAA GAG CTT AAC CAA TTA 1544
Ser Asn Asn Asp Ile Ser Gly Pro Ile Pro Glu Glu Leu Asn Gln Leu
485 490 495

CAG AAC ATA ATT TTG CTG AGA CTG GAA AAT AAT AAC CTG ACT GGT AAT 1592
Gln Asn Ile Ile Leu Leu Arg Leu Glu Asn Asn Asn Leu Thr Gly Asn
500 505 510

GTT GGT TCA TTA GCC AAC TGT CTC AGT CTC ACT GTA TTG AAT GTA TCT 1640
Val Gly Ser Leu Ala Asn Cys Leu Ser Leu Thr Val Leu Asn Val Ser
515 520 525 530

CAT AAC AAC CTC GTA GGT GAT ATC CCT AAG AAC AAT AAC TTC TCA AGA 1688
His Asn Asn Leu Val Gly Asp Ile Pro Lys Asn Asn Asn Phe Ser Arg
535 540 545

TTT TCA CCA GAC AGC TTC ATT GGC AAT CCT GGT CTT TGC GGT AGT TGG 1736
Phe Ser Pro Asp Ser Phe Ile Gly Asn Pro Gly Leu Cys Gly Ser Trp
550 555 560

CTA AAC TCA CCG TGT CAT GAT TCT CGT CGA ACT GTA CGA GTG TCA ATC 1784
Leu Asn Ser Pro Cys His Asp Ser Arg Arg Thr Val Arg Val Ser Ile
565 570 575

TCT AGA GCA GCT ATT CTT GGA ATA GCT ATT GGG GGA CTT GTG ATC CTT 1832
Ser Arg Ala Ala Ile Leu Gly Ile Ala Ile Gly Gly Leu Val Ile Leu
580 585 590

CTC ATG GTC TTA ATA GCA GCT TGC CGA CCG CAT AAT CCT CCT CCT TTT 1880
Leu Met Val Leu Ile Ala Ala Cys Arg Pro His Asn Pro Pro Pro Phe
595 600 605 610

CTT GAT GGA TCA CTT GAC AAA CCA GTA ACT TAT TCG ACA CCG AAG CTC 1928
Leu Asp Gly Ser Leu Asp Lys Pro Val Thr Tyr Ser Thr Pro Lys Leu
615 620 625

GTC ATC CTT CAT ATG AAC ATG GCA CTC CAC GTT TAC GAG GAT ATC ATG 1976
Val Ile Leu His Met Asn Met Ala Leu His Val Tyr Glu Asp Ile Met
630 635 640

AGA ATG ACA GAG AAT CTA AGT GAG AAG TAT ATC ATT GGG CAC GGA GCA 2024
Arg Met Thr Glu Asn Leu Ser Glu Lys Tyr Ile Ile Gly His Gly Ala
645 650 655

TCA AGC ACT GTA TAC AAA TGT GTT TTG AAG AAT TGT AAA CCG GTT GCG 2072
Ser Ser Thr Val Tyr Lys Cys Val Leu Lys Asn Cys Lys Pro Val Ala
660 665 670

ATT AAG CGG CTT TAC TCT CAC AAC CCA CAG TCA ATG AAA CAG TTT GAA 2120
Ile Lys Arg Leu Tyr Ser His Asn Pro Gln Ser Met Lys Gln Phe Glu
675 680 685 690

ACA GAA CTC GAG ATG CTA AGT AGC ATC AAG CAC AGA AAT CTT GTG AGC 2168
Thr Glu Leu Glu Met Leu Ser Ser Ile Lys His Arg Asn Leu Val Ser
695 700 705

CTA CAA GCT TAT TCC CTC TCT CAC TTG GGG AGT CTT CTG TTC TAT GAC 2216
Leu Gln Ala Tyr Ser Leu Ser His Leu Gly Ser Leu Leu Phe Tyr Asp
710 715 720

TAT TTG GAA AAT GGT AGC CTC TGG GAT CTT CTT CAT GGC CCT ACG AAG 2264
Tyr Leu Glu Asn Gly Ser Leu Trp Asp Leu Leu His Gly Pro Thr Lys
725 730 735

AAA AAG ACT CTT GAT TGG GAC ACA CGG CTT AAG ATA GCA TAT GGT GCA 2312
Lys Lys Thr Leu Asp Trp Asp Thr Arg Leu Lys Ile Ala Tyr Gly Ala
740 745 750

GCA CAA GGT TTA GCT TAT CTA CAC CAT GAC TGT AGT CCA AGG ATC ATT 2360
Ala Gln Gly Leu Ala Tyr Leu His His Asp Cys Ser Pro Arg Ile Ile
755 760 765 770

CAC AGA GAC GTG AAG TCG TCC AAC ATT CTC TTG GAC AAA GAC TTA GAG 2408
His Arg Asp Val Lys Ser Ser Asn Ile Leu Leu Asp Lys Asp Leu Glu
775 780 785

GCT CGT TTG ACA GAT TTT GGA ATA GCG AAA AGC TTG TGT GTG TCA AAG 2456
Ala Arg Leu Thr Asp Phe Gly Ile Ala Lys Ser Leu Cys Val Ser Lys
790 795 800

TCA CAT ACT TCA ACT TAC GTG ATG GGC ACG ATA GGT TAC ATA GAC CCC 2504
Ser His Thr Ser Thr Tyr Val Met Gly Thr Ile Gly Tyr Ile Asp Pro
805 810 815

GAG TAT GCT CGC ACT TCA CGG CTC ACT GAG AAA TCC GAT GTC TAC AGT 2552
Glu Tyr Ala Arg Thr Ser Arg Leu Thr Glu Lys Ser Asp Val Tyr Ser
820 825 830

TAT GGA ATA GTC CTT CTT GAG CTG TTA ACC CGA AGG AAA GCC GTT GAT 2600
Tyr Gly Ile Val Leu Leu Glu Leu Leu Thr Arg Arg Lys Ala Val Asp
835 840 845 850

GAC GAA TCC AAT CTC CAC CAT CTG ATA ATG TCA AAG ACG GGG AAC AAT 2648
Asp Glu Ser Asn Leu His His Leu Ile Met Ser Lys Thr Gly Asn Asn
855 860 865

GAA GTG ATG GAA ATG GCA GAT CCA GAC ATC ACA TCG ACG TGT AAA GAT 2696
Glu Val Met Glu Met Ala Asp Pro Asp Ile Thr Ser Thr Cys Lys Asp
870 875 880

CTC GGT GTG GTG AAG AAA GTT TTC CAA CTG GCA CTC CTA TGC ACC AAA 2744
Leu Gly Val Val Lys Lys Val Phe Gln Leu Ala Leu Leu Cys Thr Lys
885 890 895

AGA CAG CCG AAT GAT CGA CCC ACA ATG CAC CAG GTG ACT CGT GTT CTC 2792
Arg Gln Pro Asn Asp Arg Pro Thr Met His Gln Val Thr Arg Val Leu
900 905 910

GGC AGT TTT ATG CTA TCG GAA CAA CCA CCT GCT GCG ACT GAC ACG TCA 2840
Gly Ser Phe Met Leu Ser Glu Gln Pro Pro Ala Ala Thr Asp Thr Ser
915 920 925 930

GCG ACG CTG GCT GGT TCG TGC TAC GTC GAT GAG TAT GCA AAT CTC AAG 2888
Ala Thr Leu Ala Gly Ser Cys Tyr Val Asp Glu Tyr Ala Asn Leu Lys
935 940 945

ACT CCT CAT TCT GTC AAT TGC TCT TCC ATG AGT GCT TCT GAT GCT CAA 2936
Thr Pro His Ser Val Asn Cys Ser Ser Met Ser Ala Ser Asp Ala Gln
950 955 960

CTG TTT CTT CGG TTT GGA CAA GTT ATT TCT CAG AAC AGT GAG 2978
Leu Phe Leu Arg Phe Gly Gln Val Ile Ser Gln Asn Ser Glu
965 970 975

TAGTTTTTCG TTAGGAGGAG AATCTTTAAA ACGGTATCTT TTCGTTGCGT TAAGCTGTTA 3038
GAAAAATTAA TGTCTCATGT AAAGTATTAT GCACTGCCTT ATTATTATTA GACAAGTGTG 3098
TGGTGTGAAT ATGTCTTCAG ACTGGCACTT AGACTTCCAA AAAAAAAAAA AAAAAAAAAA 3158
AAAAAAAAAA AAAAAAAA 3176
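
The coordinates printed in this listing can be cross-checked with simple arithmetic. The sketch below is illustrative only and assumes the values as they appear here: a sequence length of 3176 and a CDS feature at 51..2978, read as 1-based inclusive positions. The function and variable names are hypothetical and are not part of the listing.

```python
# Coordinates as printed in the sequence listing (assumed 1-based, inclusive).
SEQ_LENGTH = 3176
CDS_START, CDS_END = 51, 2978


def cds_summary(seq_length, cds_start, cds_end):
    """Return codon count and flanking lengths for a 1-based inclusive CDS."""
    cds_nt = cds_end - cds_start + 1
    if cds_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return {
        "cds_nt": cds_nt,
        "codons": cds_nt // 3,
        "five_prime_nt": cds_start - 1,
        # Everything after the last coding position, including the TAG stop codon.
        "three_prime_nt": seq_length - cds_end,
    }


summary = cds_summary(SEQ_LENGTH, CDS_START, CDS_END)
print(summary)
```

Under these assumptions the codon count agrees with the residue numbering of the translation, whose last residue is numbered 976.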



[0057] mm^ : 2 

S?'I^«S : 9295 

h*ni;-:iia« 

Hiyi|<Oj!3i : genomic DNA 



2228. 



2367. 



: i/x*A (Arabidopsis thaliana) 

Feature   Location
exon      1803..1881
intron    1882..2227
exon      2228..2366
intron    2367..2467
exon      2468..2539
intron    2540..2643
exon      2644..2715
intron    2716..2809
exon      2810..2878
intron    2879..2968
exon      2969..3040
intron    3041..3118
exon      3119..3190
intron    3191..3266
exon      3267..3338
intron    3339..3421
exon      3422..3493
intron    3494..3586
exon      3587..3655
intron    3656..3740
exon      3741..3812
intron    3813..3888
exon      3889..3960
intron    3961..4048
exon      4049..4120
intron    4121..4209
exon      4210..4281
intron    4282..4349
exon      4350..4421
intron    4422..4508
exon      4509..4580
intron    4581..4706
exon      4707..4778
intron    4779..4860
exon      4861..4932
intron    4933..5018
exon      5019..5090
intron    5091..5176
exon      5177..5248
intron    5249..5412
exon      5413..5481
intron    5482..5576
exon      5577..5648
intron    5649..5726
exon      5727..5800
intron    5801..5882
exon      5883..6011
intron    6012..6095
exon      6096..6443
intron    6444..6519
exon      6520..6890
intron    6891..6974
exon      6975..7328
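
One way to sanity-check a feature table like the one above is to confirm that exons and introns alternate and that each feature begins one position after the previous one ends. The following is a minimal sketch under stated assumptions: only the first six features are transcribed, coordinates are treated as 1-based inclusive, and `check_tiling` is a hypothetical helper name, not something defined in this document.

```python
# First six features transcribed from the listing (assumed 1-based, inclusive).
FEATURES = [
    ("exon", 1803, 1881),
    ("intron", 1882, 2227),
    ("exon", 2228, 2366),
    ("intron", 2367, 2467),
    ("exon", 2468, 2539),
    ("intron", 2540, 2643),
]


def check_tiling(features):
    """Return a list of problems; an empty list means the features alternate and abut."""
    problems = []
    for (kind_a, _, end_a), (kind_b, start_b, _) in zip(features, features[1:]):
        if kind_a == kind_b:
            problems.append(f"two consecutive {kind_a} features before {start_b}")
        if start_b != end_a + 1:
            problems.append(f"gap or overlap between {end_a} and {start_b}")
    return problems


problems = check_tiling(FEATURES)
# Total exon length within this excerpt only.
exon_nt = sum(end - start + 1 for kind, start, end in FEATURES if kind == "exon")
print(problems, exon_nt)
```

The same check can be run over the full table by extending `FEATURES` with the remaining rows.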



GAATTCAAAG GAATAAGCAT CGGAGACGAT TTAATGTTAC CTCTTGAOGT ATTTATCCAA 60
TTTATCCATT AAGCCACCAG CCATAGCATC TGATCATCAT CATCAACATA TAAATAACCA 120
AATTTGAAAT GAACAAAAGT CGAATTGGTG ATATTGAAAA TCGAGTTCGT GAAATTGAGA 180
ATCGGATTGG TGAATTTGAA GAGAGATGCG TGTACCGTTA GGGAGGAGGA GGAGACGGGA 240
GAGAAAAAAG GAGACGGAGA TAACTCGCCG GCTCTGTTTC CATGGCGGAG GTGATAATGT 300
AGCTGCGCAC GTTAGCTTTT TGTGGTTTGA GTTGGAGAAC AGTGGGAGGC TCACGGTAGC 360
GTGGAGTGAC GACATTGGGG ATAACACCAG AGGCGTCTTA TCTCCGTTGG ACAAATTATT 420
ATTATGGCTA TGAACATTCA ACATATAATT TAATTAGAAA TTTGCGGATG AAAAAGAGGT 480
AAACAATTGC AGAAATGGTT AAAAATATTA ACGTTGTACA GCAAATGATA ATAAAAAGTG 540
TAACGTACAG TGTGTAAGGA ATGGAAAAAT AATAATTTGG GTTAAAATAA ATATGTAGTT 600
TTCTAACTAT ATAGTACTTT TTGAGAAAAG ATAATATTAT GTGTATTTTT ATTGAAACAA 660
ATAAATGATT TAACAAAAAA AAAAAGAGAA GTTAAAATGA AAAGGAATTA TTATTTTTTA 720
AGTTCTTCCT TCTTTTGTTG GGCCTGTGAC CCTTTTAGTT TTAGTCCACT TCGTTCTCAA 780
AGCTTCAAAA TATTAATTTT GTGACAAACC GACCGGAGCC AACCAAACCG GTTAACATCC 840
TAAAACCAAT CATATTTTAT TAAGTTTTGT GTTGATGCTA AACCAAAAAT CATTGGCATG 900
CATATTTCTA AATTTAGTAA TAAACAAAAA CACTTAGAAA TCACACGTTC ACTATACTAA 960
AAAACGTTGA CAAAAACACA ACAACTATAC TAATAATTAA AGAAGAGAAA ACTGAACCAA 1020
ACTTTTTGTA AACTCCTGAA TTTAAATTAG TAATTGAAGT AAGAAGATGA AGAAGAACAT 1080
GTTAAGCAAA CAAAAAAATT ACACTAAAAT CATATAAAAA TACATAATTA CAAAAGTACC 1140
CATAAGATGG ATTTATTGAT ATGGGTCATC TGTGAAACAA GCCACAGAGA GACAAAGACT 1200
CGTAAGTATT GGGCAACGAA AGCGACCTCC TTTATTCACC ACTGCCATTA ACATGTTCTT 1260
CTTCTCCTTC TTCTTCTACA TTTTATGACC GTTTTACCCT TCAAGAGAGA GAAACAAAAT 1320
CACTCCCTCT CACTCACTCr ATCTCTCTCT TCTGCAAAGC TTCAGAACTC TGGCAGAGAG 1380
ATAAAAGATG ATGGGGTTTT TAACTTTATC CTCCCCAAAT AATTCTTCTT CCCTTCATCT 1440
CTCTCTCTTA CACAACAGGT CCCTACATTT GTACAATCTC CTCTCTTTAA AGACTCTCTC 1500
TCTTTCTaC TCCATCTCTA TCTTACTCTG TATTTCTGTC GTCTGAGCAC TCAATGAAAC 1560
CACTGTAAAT TTCOGCCAGA ATTTGATGTG ATGGAACGAT AAAAATCATT TTTTCTCGGT 1620
TAAAGTAAAA AAACAAAAAC AAATTTCTGT AGAAATCATA ATAAAAGAAA GAAAAAAAAT 1680
CTAATGTCGG TACATAATAC GGTTCTCTTC TTCTTCTCTA TCCTCTGTTT CTTCTTCATG 1740
GAGACTTGAA AGCTTTTAAA GTATATCTAA AAACGCAGTC GTTTTAAGAC TGTGTGTGAG 1800
AAATGGCTCT GTTTAGAGAT ATTGTTCTTC TTGGGTTTCT CTTCTGCTTG AGCTTAGTAG 1860

CTACTGTGAC TTCAGAGGAG GGTCAGTTAT TATACTGATG CATGCTTCTT CAAGTTCAAG 1920

ATTTTCGTCT TTTTGTTTTA TATTAGTGAA AAAAACTTAA AGATGAGATT TTTATATGAT 1980 

TTTTGAAGTT TCATTTGGTG AAAATGAGAT CTGGGTACTT GTTATTTTCT ATTTTTGCTT 2040 

TTTGTAATGG TTTTTTTTTA CTTGGTGGGT CTTCTATAGA ATCAAAAGAA GCTTTGAATA 2100 

AATTAGGGTT TGAGTTTTAT TTTGTTTTCT TGGAAGTTGA ATTTTTAATC TTCTCAAGAA 2160 

CTGACAAATA TTTTTTTTTG TTTTTGTGCG TGTGTGTTAA TAAAATATCC TTAAAACAAA 2220 

ATTAAAGGAG CAACGTTGCT GGAGATTAAG AAGTCATTCA AAGATGTGAA CAATGTTCTT 2280 

TATGACTGGA CAACTTCACC TTCTTCGGAT TATTGTGTCT GGAGAGGTGT GTCTTGTGAA 2340 

AATGTCACCT TCAATGTTGT TGCTCTGTAA GTTTCTTCAT TCCTTTAGAT TACTATTACA 2400 

GTGGTTTTTG GTGTTCTTGT GGGAAAAAGT TGTAATTTGT TTTGTGTGTG TTTTCTATGT 2460 

TTTGTAGTAA TTTGTCAGAT TTGAATCTTG ATGGAGAAAT CTCACCTGCT ATTGGAGATC 2520 

TCAAGAGTCT CTTGTCAATG TAACTGTTTC AACATTCACT GTAGCATGAA ATAAAGTATC 2580 

TTACTTTAAT TCTATTCCAC TCTCTGAGTT GTGACTTTTG TCTTCTGTTT TTTTCTAATG 2640 

TAGTGATCTG CGAGGTAATC GCTTGTCTGG ACAAATCCCT GATGAGATTG GTGACTGTTC 2700

TTCTTTGCAA AACTTGTAAG AACAGTGATT GGTGTTATTC TACCATTAAA CTTTTGTTCA 2760 

TAGAGGTTTT ATTTGATGAA GTGTGTTCAT GTTGTTTTTA ATTCAGAGAC TTATCCTTCA 2820 

ATGAATTAAG TGGTGACATA CCGTTTTCGA TTTCGAAGTT GAAGCAACTT GAGCAGCTGT 2880 

AAGTAGCTAG TTATTCTGCT ACTAGTCTTC ATATGTCATT GCTAAAAATA TACTCACCAT 2940 

GTGGAATATG GATTTTTACT TTGTCCAGGA TTCTGAAGAA TAACCAATTG ATAGGACCGA 3000 

TCCCTTCAAC ACTTTCACAG ATTCCAAACC TGAAAATTCT GTATGTTCCC CATGATTCTT 3060 

ACATGTCTTA CTACTTTTAG CTATATAGGT GATCATACAT GTGTAATTTC AATTGCAGGG 3120 

ACTTGGCACA GAATAAACTC AGTGGTGAGA TACCAAGACT TATTTACTGG AATGAAGTTC 3180 

TTCAGTATCT GTAAGTGTCA ATGTTTTTTG AAGTCTGTCA ATGTCTCTTC ATTACCCGGT 3240 

GATAATTGTT GTACTATGAT GAGCAGTGGG TTGCGAGGAA ACAACTTAGT CGGTAACATT 3300 

TCTCCAGATT TGTGTCAACT GACTGGTCTT TGGTATTTGT GAGTCTTCTT GCACATCTGA 3360 

ATAGTATGAT GAGTTCTTTT GTAAATATCA AATATCTGAC TTTGTTTTGA TATTGAATCA 3420 

GTGACGTAAG AAACAACAGT TTGACTGGTA GTATACCTGA GACGATAGGA AATTGCACTG 3480 

CCTTCCAGGT TTTGTATGTG CCTCTTTCTC TACTTCTAAA CATCATTACT GTAATTTGGG 3540 

TTACTTAAGA AAATCTACTT AACTGGTTTG CTTATTACGA ACTCAGGGAC TTGTCCTACA 3600 

ATCAGCTAAC TGGTGAGATC CCTTTTGACA TCGGCTTCCT GCAAGTTGCA ACATTGTTAG 3660

TTCTCACCTC TACTAATCTT TTGCTTTAAA TTTTGGCTAG CCTTTGTTTT CTTTTAAAGA 3720 

AGATCATTTT CTTATCTTAG ATCATTGCAA GGCAATCAAC TCTCTGGGAA GATTCCATCA 3780 

GTGATTGGTC TCATGCAAGC CCTTGCAGTC TTGTAAGTAC TTTTCTTCTA ATCAATGAAG 3840 

CTACTTATAA CATTTTCATG AACTTAGGTT ATATGTTTTC TTTTACAGAG ATCTAAGTGG 3900 

CAACTTGTTG AGTGGATCTA TTCCTCCGAT TCTCGGAAAT CTTACTTTCA CCGAGAAATT 3960 

GTAATTCTTT ACCTGTTTGT TTTCAGTTTG GAGTCAAATG TCATACCATG TTAATGATAG 4020 

TGATTTATCT TTTTGGCTTT ATCTCTAGGT ATTTGCACAG TAACAAGCTG ACTGGTTCAA 4080 

TTCCACCTGA GCTTGGAAAC ATGTCAAAAC TCCATTACCT GTATGACCAA CCTTCTCTTC 4140 

ACTTCTCTTT TTGCATACAG TCACTACTAA GTTGTGTTTC CTTATCAACT ATTTGTAAAA 4200 

TATTCATAGG GAACTCAATG ATAATCATCT CACGGGTCAT ATACCACCAG AGCTTGGGAA 4260

GCTTACTGAC TTGTTTGATC TGTAAGTAGT TCTTCCTATG CTTGACATGT TTTGATGTTC 4320 

TTATGCTTAT ATGAACTATG TACATATAGG AATGTGGCCA ACAATGATCT GGAAGGACCT 4380 

ATACCTGATC ATCTGAGCTC TTGCACAAAT CTAAACAGCT TGTATGTATC TCTTTCTCTG 4440 

AAAACTTCTC ACTTGAATGT TCAAGATTGG TGCTTTATAT GATTTTGTGT CTCATTAATG 4500 

TAATGTAGAA ATGTTCATGG GAACAAGTTT AGTGGCACTA TACCCCGAGC ATTTCAAAAG 4560 

CTAGAAAGTA TGACTTACCT GTAAGTATCG ACGCTGAGAA TTTCTCTAAT CTTATATAAT 4620 

ATATAGTTCC ACAGCGTTTG TTTTTTCGAA TTTCAAGTCA TTAACTACTG AGTTTTTGGT 4680 

TGCCTTTGAT TTATCGGTTC AACCAGTAAT CTGTCCAGCA ACAATATCAA AGGTCCAATC 4740 

CCGGTTGAGC TATCTCGTAT CGGTAACTTA GATACATTGT AAGTGTTTCT TGTTTTCTGT 4800

GAAGTATACA TCATTATATG TGCCTTGTCT CACATTTATT AAATTTAATG ACATTTGAAG 4860

GGATCTTTCC AACAACAAGA TAAATGGAAT CATTCCTTCT TCCCTTGGTG ATTTGGAGCA 4920

TCTTCTCAAG ATGTGAGCAT CCATAAGACC TCCAGTTTTA TTGTTTATTT CTAGCAAAAG 4980 

ATGAAAATGG TTTGTGAACT CTTGCATTCT TGTTATAGGA ACTTGAGTAG AAATCATATA 5040 

ACTGGTGTAG TTCCAGGCGA CTTTGGAAAT CTAAGAAGCA TCATGGAAAT GTAAGAAGTT 5100 

AACTTCTATC TGCTTGGTTA GAGTTTTTTT CATTTATCTC AATTACTGTT CTGAATTTGT 5160 

GTGTTTGTGG TTGCAGAGAT CTTTCAAATA ATGATATCTC TGGCCCAATT CCAGAAGAGC 5220 

TTAACCAATT ACAGAACATA ATTTTGCTGT AAGCAATCTT CCTCTTATCC CTTCCAAGCT 5280

GTTAAGAAAT TGTTTTTGTA GAATGAAACT AAAACTCTGT ATACACAATA ATGAGGTCAC 5340 

TATAGTGTGA TCCAGGAACA TGTATTGGGT TGGTGATCTA TCTAATGTTG TGTTTCTTAA 5400 

AATTGCTTGC AGGAGACTGG AAAATAATAA CCTGACTGGT AATGTTGGTT CATTAGCCAA 5460 

CTGTCTCAGT CTCACTGTAT TGTAAGTAGG CACCTTTGGT TCTGAAACAT TTTTTGTCCC 5520 

TCTTTGTGCA TCTTTTGCTA AGAATATAAC CCTGCAATCT TCACTAACTC TTATAGGAAT 5580 

GTATCTCATA ACAACCTCGT AGGTGATATC CCTAAGAACA ATAACTTCTC AAGATTTTCA 5640 

CCAGACAGGT ATGGTAATTT AGCAGGTTTT GGTATTGTGC ATTTTGTTTT GTTTGCTAAT 5700 

ATCTATGTTT ATGTTTTTGG ATAAAGCTTC ATTGGCAATC CTGGTCTTTG CGGTAGTTGG 5760 

CTAAACTCAC CGTGTCATGA TTCTCGTCGA ACTGTACGAG GTGATTACAT TCTTCTAAAA 5820 

GCTTCCATTC ACAAAACCTA AGATAATTAA AGCTCATGTT TCTATCCATG TTTTGTCTGC 5880 

AGTGTCAATC TCTAGAGCAG CTATTCTTGG AATAGCTATT GGGGGACTTG TGATCCTTCT 5940 

CATGGTCTTA ATAGCAGCTT GCCGACCGCA TAATCCTCCT CCTTTTCTTG ATGGATCACT 6000 

TGACAAACCA GGTCTACTCT CCAAACCACT TTACGAATGT TCTTCACCTA CAATGTAATC 6060

CAATAGTTAA TCCTTAAATT TCCTGGTGAC ATCAGTAACT TATTCGACAC CGAAGCTCGT 6120 

CATCCTTCAT ATGAACATGG CACTCCACGT TTACGAGGAT ATCATGAGAA TGACAGAGAA 6180

TCTAAGTGAG AAGTATATCA TTGGGCACGG AGCATCAAGC ACTGTATACA AATGTGTTTT 6240 

GAAGAATTGT AAACCGGTTG CGATTAAGCG GCTTTACTCT CACAACCCAC AGTCAATGAA 6300 

ACAGTTTGAA ACAGAACTCG AGATGCTAAG TAGCATCAAG CACAGAAATC TTGTGAGCCT 6360 

ACAAGCTTAT TCCCTCTCTC ACTTGGGGAG TCTTCTGTTC TATGACTATT TGGAAAATGG 6420

TAGCCTCTGG GATCTTCTTC ATGGTAAGTC TCATCGCCAA ACATAGAAAA TTATTTGAAT 6480 

CTTCTGTGAC ATAACAACTT GCTTGTGTGT TTTGTAAAGG CCCTACGAAG AAAAAGACTC 6540 

TTGATTGGGA CACACGGCTT AAGATAGCAT ATGGTGCAGC ACAAGGTTTA GCTTATCTAC 6600 

ACCATGACTG TAGTCCAAGG ATCATTCACA GAGACGTGAA GTCGTCCAAC ATTCTCTTGG 6660 

ACAAAGACTT AGAGGCTCGT TTGACAGATT TTGGAATAGC GAAAAGCTTG TGTGTGTCAA 6720 

AGTCACATAC TTCAACTTAC GTGATGGGCA CGATAGGTTA CATAGACCCC GAGTATGCTC 6780 

GCACTTCACG GCTCACTGAG AAATCCGATG TCTACAGTTA TGGAATAGTC CTTCTTGAGT 6840 

TGTTAACCCG AAGGAAAGCC GTTGATGACG AATCCAATCT CCACCATCTG GTTTGTTCTT 6900 

TCTTGCCTAT CTCTCTCAGC TGCTCTGTTT AGGTCAAGTC CGTAATCTTG TTTTCATTGA 6960 

TTCACTTACA TCAGATAATG TCAAAGACGG GGAACAATGA AGTGATGGAA ATGGCAGATC 7020 

CAGACATCAC ATCGACGTGT AAAGATCTCG GTGTGGTGAA GAAAGTTTTC CAACTGGCAC 7080 

TCCTATGCAC CAAAAGACAG CCGAATGATC GACCCACAAT GCACCAGGTG ACTCGTGTTC 7140 

TCGGCAGTTT TATGCTATCG GAACAACCAC CTGCTGCGAC TGACACGTCA GCGACGCTGG 7200 

CTGGTTCGTG CTACGTCGAT GAGTATGCAA ATCTCAAGAC TCCTCATTCT GTCAATTGCT 7260 

CTTCCATGAG TGCTTCTGAT GCTCAACTGT TTCTTCGGTT TGGACAAGTT ATTTCTCAGA 7320 

ACAGTGAGTA GTTTTTCGTT AGGAGGAGAA TCTTTAAAAC GGTATCTTTT CGTTGCGTTA 7380 

AGCTGTTAGA AAAATTAATG TCTCATGTAA AGTATTATGC ACTGCCTTAT TATTATTAGA 7440 

CAAGTGTGTG GTGTGAATAT GTCTTCAGAC TGGCACTTAG ACTTCCTATA AGTTCTTGCC 7500 

TATCTAAGTT TTTCTAAATT GGGTTATTCT TGTAACATAT CTTAGATCTA GTACTCAACA 7560

CCACGTCACC ACCACAAAAG ATTTCTTATG CTCAAAAACA TATACATAGA AAGAACCFTC 7620 

TAAACTACGA GAAACGTTTT GCTATGTAGT GTTATATGTC AACCACGTCT ATGAGAGTGC 7680 

AAACGATAGG TTAATAAGTT TTCTCACTTG GCAATAAAAA TGATAAACAA ATATATTGTC 7740 

TGATTAATTT ATTTTATATA GTTTTTTTAT AATTTCTTAT ATTAATTCGA ACTCATACAG 7800 

CGCGTGAGAC TTTCTAGTTT AGTATAAAGT ACGTATTTTT GCAAAATCAA AATCGTAAAT 7860

ACATACATTT
CTAAGATTTT
CTAACTTTGG

The Arabidopsis ERECTA Gene Encodes a Putative
Receptor Protein Kinase with Extracellular 
Leucine-Rich Repeats 

Keiko U. Torii, 1,2 Norihiro Mitsukawa, 2 Teruko Oosumi, 3 Yutaka Matsuura, c Ryusuke Yokoyama, c
Robert F. Whittier, b and Yoshibumi Komeda

a Molecular Genetics Research Laboratory, University of Tokyo, Hongo, Tokyo 113, Japan

b Mitsui Plant Biotechnology Research Institute, TCI-D21, Sengen, Tsukuba 305, Japan 

c Division of Biological Sciences, Graduate School of Science, Hokkaido University, Sapporo 060, Japan 



Arabidopsis Landsberg erecta is one of the most popular ecotypes and is used widely for both molecular and genetic
studies. It harbors the erecta (er) mutation, which confers a compact inflorescence, blunt fruits, and short petioles. We
have identified five er mutant alleles from ecotypes Columbia and Wassilewskija. Phenotypic characterization of the mutant
alleles suggests a role for the ER gene in regulating the shape of organs originating from the shoot apical meristem.
We cloned the ER gene, and here, we report that it encodes a putative receptor protein kinase. The deduced ER protein
contains a cytoplasmic protein kinase catalytic domain, a transmembrane region, and an extracellular domain consisting
of leucine-rich repeats, which are thought to interact with other macromolecules. Our results suggest that cell-cell
communication mediated by a receptor kinase has an important role in plant morphogenesis.



INTRODUCTION 



The form of higher plants is the consequence of the repetitive
divisions and subsequent differentiation of the cells produced
by the shoot apical meristem. The shoot apical meristem keeps
initiating new organs throughout the life of plants, while maintaining
itself as a formative region. Organ primordia are derived
from numerous cells that originate from multiple lineages
(Szymkowiak and Sussex, 1992). These cells coordinate their
growth patterns to develop determinate organs. Thus, cell-cell
signaling is crucial in determining organ shape.

The molecular nature of these signals for cell-cell communication
is not fully understood. Recent molecular genetic
studies using Antirrhinum and maize have, however, identified
genes that potentially mediate cell-cell communication.
Mosaic analyses using the maize leaf mutants Teopod, Rough
sheath, and Knotted indicate that the gene products may act
non-cell autonomously (Sinha and Hake, 1990; Dudley and
Poethig, 1993; Becraft and Freeling, 1994). KNOTTED may
be able to move from mesophyll (L2) cells, which express KNOTTED,
to the epidermal (L1) cells, which do not express KNOTTED
(Jackson et al., 1994). In Antirrhinum, a mosaic analysis of
floricaula has also shown that it can act non-cell autonomously
in the floral meristem (Carpenter and Coen, 1995; Hantke et
al., 1995). It is not known how Teopod, Rough sheath, or floricaula
signal to the surrounding cells to coordinate organogenesis.

We have taken a genetic approach to determine the mechanism
specifying organ shape. At the vegetative stage, an
Arabidopsis shoot apical meristem produces leaves and axillary
meristems. Upon entering the reproductive stage, the shoot
apical meristem converts into the inflorescence meristem,
which then produces floral meristems. In the typical rosette
plant Arabidopsis, a transition from vegetative to reproductive
development accompanies the elongation of the inflorescence
stalk (i.e., bolting). Generation of floral buds and subsequent
elongation of the internodes seem tightly coupled, thus producing
a highly ordered branching pattern. We have performed
a mutant screen for altered inflorescence branching patterns
(Tsukaya et al., 1993; Komeda and Torii, 1994) and isolated
five mutations allelic to Landsberg erecta (Ler).

Ler is one of the most popular ecotypes of Arabidopsis and
has been widely used for both molecular and genetic studies
(Hwang et al., 1991; Anderson and Mulligan, 1992). Ler was
isolated from mutagenized seed populations in the 1950s
(Redei, 1992). It harbors the erecta (er) mutation and shows
an altered organ shape. Ler develops a very compact inflorescence
with flowers clustering at the top. Ler plants also
display round leaves with short petioles and short and blunt
siliques (Redei, 1992; Bowman, 1993). The compact stature
of Ler is preferred by geneticists, and thus Ler has been used
as a wild-type strain to isolate numerous mutants. These
include mutants in photomorphogenesis, phytohormone biosynthesis
and signal transduction, and flower organ identity.
Many such mutants have been characterized, despite the fact
that no one knew the nature of the er mutation.

As an initial step toward understanding the molecular mech- 
anism regulating the specific organ shape, we isolated the ER 
gene. The ER gene encodes a putative receptor protein kinase 
with an extracellular ligand binding domain, implicating the 
existence of an intercellular signal transduction pathway that 
is required for proper development of organs derived from the 
shoot meristem. 



RESULTS 



Phenotypes of er Mutants 

We isolated new mutant alleles at an er locus from ecotypes
Columbia (Col) and Wassilewskija (WS). Plants homozygous
for all er alleles show significantly compact inflorescences compared
with those of the wild types (Figure 1A). Inflorescence
stems are thicker in plants homozygous for er alleles when
compared with the wild types (data not shown). It seems that
the short and thick inflorescence stem phenotype makes er
mutants "erect." Flower buds are clustered at the top of the
inflorescence in plants homozygous for each er allele without
affecting phyllotaxis (Figure 2). Moreover, the number of flower
buds at the first flowering was increased in er mutants (see
the legend to Figure 2). Thus, the er mutation may somehow
affect the coordination of stem elongation and flower bud formation.
The number of lateral inflorescences was not altered
in plants homozygous for all er alleles, suggesting that the er
mutation may not affect apical dominance (data not shown).
Siliques are blunt, short, and wider in plants homozygous for
all er alleles (Figures 1C, 1D, and 3). Because flowers of er
and wild-type plants are similar (Figure 2), it is likely that the
er mutation affects the elongation of carpels after fertilization.
er mutants also have very short pedicels compared with those
of the wild types (Figure 1B).

Leaf morphology is varied in plants homozygous for each
er allele (data not shown). Ler has round leaves with a short
petiole, as previously described (Redei, 1992; Bowman, 1993).
Leaves of er-102 are small and curly, and these traits cosegregate
with the other phenotypes described above (data not
shown). In contrast, leaves of er-101 and er-104 seem less
affected by the mutations, and leaves of er-103 are almost
indistinguishable from those of the wild types (data not shown).
The leaves of er-101/er-102 heterozygous plants display an intermediate
phenotype (data not shown). No difference was
observed in roots of wild-type plants and er mutants (data not
shown). From these observations, a corresponding wild-type
ER gene is likely to be required for proper elongation of various
organs of shoot meristem origin.

Precise phenotypic analysis revealed the degree of severity
among er alleles (Figures 1A to 1D). er-102 plants are the most
compact (Figure 1A), having the shortest petioles (Figure 1B)
and the shortest and widest siliques (Figures 1C and 1D). Thus,
we conclude that this is the most severe allele. In contrast,
er-103 plants are taller (Figure 1A), having long siliques (Figure 1C)
of the same width as those of the wild types (Figure
1D). Therefore, er-103 seems to be the weak allele. The severity
of the er-101, er-104, er-105, and Ler alleles seems to be approximately
the same (Figures 1A to 1D), except that Ler plants
tend to have longer pedicels (Figure 1B).



Molecular Identification of the ER Locus and Defects 
in the er Alleles 

The er-104 allele, generated by T-DNA insertional mutagenesis,
was used to isolate the ER gene. Initially, er-104 harbored
two independent T-DNA insertions. We backcrossed er-104
twice into wild-type Col and obtained a line that has a single
T-DNA insertion. Genetic analyses indicated that the insertion
is tightly linked to the er locus (data not shown). By using
the T-DNA as a probe, DNA gel blot analysis revealed a complex
insertion of 25 T-DNA copies with left borders at both
genomic DNA junctions (data not shown). By using the pBR322-derived
replication origin and ampicillin-resistance marker present
in this portion of the T-DNA, both genomic junctions were
recovered separately by plasmid rescue (Figure 4A). They
showed a polymorphism between er-104 and the WS wild type
(Figure 4B) and hybridized with six of eight yeast artificial chromosome
clones (namely, EG1D5, EG2A1, EG2B1, EG10A10,
EG10H3, and EG16C6), which contain the GPA1 locus, an
Arabidopsis G protein α subunit gene located within 1 centimorgan
of the er locus (data not shown) (Ma et al., 1990;
Hwang et al., 1991; Hwang and Goodman, 1995).

Two independent transcripts of 3.3 and 0.6 kb were found
within the 13-kb region used to screen the cDNA library (Figure 4A).
RNA gel blot analysis revealed the absence of the
3.3-kb transcript in er-104 and er-105, whereas the expression
level of the 0.6-kb transcript was not affected in these alleles
(data not shown), suggesting that the 3.3-kb transcript is most
likely the ER transcript.

Six cDNA clones corresponding to the 3.3-kb transcript were
isolated from an Arabidopsis Col cDNA library. The longest
cDNA is 3176 bp and contains a single open reading frame
of 976 amino acid residues with a calculated molecular mass
of 107.3 kD (Figure 5). The presence of an in-frame stop codon
upstream of the first ATG confirmed that this is the initiation
codon. Comparison of genomic and cDNA sequences revealed
the presence of 26 introns (Figures 5 and 6A). Comparison
of cDNA and plasmid-rescued DNA revealed that the T-DNA
was inserted in the 5' untranslated region (Figure 5). The T-DNA
insertion was associated with the deletion of 28 nucleotides,
from -6 to +22 of the 5' terminus of the full-length ER cDNA
(Figure 5), suggesting that the transcriptional initiation point
was deleted. er-105 was generated by fast-neutron irradiation

Figure 1. Comparison of the Inflorescence, Pedicel, and Silique Lengths and the Silique Widths of er Mutants and Wild-Type Plants.

(A) The length of the main inflorescence of 40-day-old wild-type (wt) plants and plants homozygous for each er allele. Length was measured 
and designated as plant height. At least 27 plants with each er allele and the wild-type plants were measured. 

(B) The pedicel length of 40-day-old plants. Ten pedicels from the base of the main inflorescence of five individual plants (total of 50 pedicels)
with each er mutant allele and the wild-type plants were measured. 

(C) Silique length. Ten fully expanded siliques from five individual plants (total of 50 siliques) with each er allele and the wild-type plants were
measured before desiccation. 

(D) Silique width. Fifty siliques, as given above for (B) and (C), were measured. 
The mean values are shown. Error bars represent standard deviation. 



738 The Plant Cell 




Figure 2. Inflorescence Morphology of er Mutants and Wild-Type Plants.

Shown are top views of inflorescences at first flowering.

(A) Wild-type Col.

(B) er-101.

(C) er-102.

(D) er-103.

(E) er-105.

(F) Ler.

(G) Wild-type WS.

(H) er-104.

Flower buds of plants homozygous for all er alleles are tightly clustered at the top compared with those of the wild-type plants. Numbers of flower
buds at first flowering were 19, 21, and 21 for er-101, er-102, and er-104, respectively, and 15 and 16 in Col and WS wild types, as shown. Bars = 500 μm.



and was therefore expected to result from a gross DNA rearrangement.
We performed DNA gel blot analysis with er-105
(Figure 4C) and found that ~4 kb of DNA of unknown origin
is inserted within the ER locus (Figures 4A and 4C). A precise
polymerase chain reaction analysis determined the region of
insertion between +5 and +1056 after the first ATG for translation
in the genomic sequence (data not shown). Molecular
defects in the er-104 and er-105 alleles are consistent with the 
absence of the transcripts (Figure 7A). None of the other er 
alleles or Ler showed polymorphism with wild-type Col or WS 
when their genomic DNA was analyzed by DNA gel blotting 
(data not shown). The ER gene is most likely a single copy 
(Figure 4C). 

To confirm further that we had cloned the ER gene, two additional
alleles were characterized at the molecular level.
Ethylmethane sulfonate-generated er-103 has a G-to-A transition
at position +846, which changes amino acid 282 from
methionine to isoleucine (Figures 6A and 6B). In Ler, a T-to-A
transversion at position +2249 was found, and this change
results in a substitution of lysine for isoleucine at amino acid
750 (Figures 6A and 6C). Ler also contains two silent mutations
(T-to-C transitions at positions +1389 and +1608),
which are most likely due to the polymorphism between Col
and Ler ecotypes.
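
The reported coordinates can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that positions are numbered on the coding sequence with +1 at the A of the first ATG and no introns counted (i.e., cDNA coordinates). The function names are illustrative only and are not part of the original study.

```python
# Map a coding-sequence position to its codon, and classify a base change.
# Assumption: positions are 1-based cDNA coordinates starting at the first ATG.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def codon_location(nt_pos):
    """Return (codon number, position within codon), both 1-based."""
    codon = (nt_pos - 1) // 3 + 1
    offset = (nt_pos - 1) % 3 + 1
    return codon, offset


def substitution_class(ref, alt):
    """Purine<->purine or pyrimidine<->pyrimidine is a transition;
    any purine<->pyrimidine change is a transversion."""
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# (allele, position, reference base, mutant base) as reported in the text
changes = [
    ("er-103", 846, "G", "A"),
    ("Ler", 2249, "T", "A"),
    ("Ler", 1389, "T", "C"),
    ("Ler", 1608, "T", "C"),
]

for allele, pos, ref, alt in changes:
    codon, offset = codon_location(pos)
    print(allele, pos, "codon", codon, "position", offset,
          substitution_class(ref, alt))
```

Under this numbering, +846 falls on the third position of codon 282 and +2249 on the second position of codon 750, in agreement with the amino acid changes given above. The two silent changes fall on third codon positions (codons 463 and 536). By the standard purine/pyrimidine definition, G-to-A and T-to-C changes are transitions, and T-to-A is a transversion.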



The ER Gene Encodes a Putative Receptor Protein 
Kinase with a Ligand Binding Domain 

The deduced amino acid sequence of ER shows characteristics
of a transmembrane receptor protein kinase with distinct
domains (Figures 6A to 6C). Two hydrophobic domains are
present at the N terminus (amino acids 1 to 20) and between
amino acids 580 and 602 (Figures 5 and 6A). These are consistent
with a signal peptide and a transmembrane domain,
respectively (Weinstein et al., 1982; von Heijne, 1983). The
C-terminal cytoplasmic region (amino acids 648 to 914) comprises
a putative catalytic domain of protein kinase (Figures
5 and 6C) (Hanks and Quinn, 1991). A putative extracellular
domain (amino acids 75 to 530) contains 20 tandem copies
of a 24-amino acid leucine-rich repeat (LRR) (Figures 5 and
6B). These repeats have been implicated in protein-protein
interactions (Kobe and Deisenhofer, 1994). Each
unit of the LRR is encoded by identically sized exons, and introns
of similar sizes are present at the exact same position,
between the second and third nucleotides of the codon for
leucine at position 13 (underlined) in the consensus
P**LG*L**L**L*L**N*L*G*I (asterisks represent nonconserved amino
acids) (Figures 5 and 6B). Thus, this domain has most likely
evolved by exon duplication. The mutation in er-103 occurs in
a consensus position of the LRR, changing methionine to
isoleucine (Figures 6A and 6B). Both amino acids are similar,
but the latter lacks sulfur. Thus, the mutation may alter the structure
of the LRR domain slightly, possibly affecting the
receptor-ligand interaction and resulting in the weak phenotype.
The presence of 15 N-glycosylation sites (Asn-X-Ser/Thr)
(Figure 5) suggests that ER is a glycosylated protein.
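
Potential N-glycosylation sites of the Asn-X-Ser/Thr type can be located with a short pattern search. The sketch below is an illustration only: the sequence is a made-up toy peptide, not the ER protein, and the exclusion of proline at the X position is a common convention that we assume here; the text above states the motif simply as Asn-X-Ser/Thr.

```python
import re

# Asn-X-Ser/Thr sequon. The lookahead (?=...) lets overlapping sites be found.
# Assumption: proline is excluded at X, a widely used convention that the
# source text does not state explicitly.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return the 1-based positions of the Asn residue of each sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, for illustration only.
toy_peptide = "MKNGSLLNPTANVTQNNSS"
print(find_sequons(toy_peptide))  # [3, 12, 16, 17]
```

In the toy peptide, the Asn at position 8 is skipped because it is followed by proline, whereas the adjacent Asn residues at positions 16 and 17 are both reported because the search allows overlapping matches.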

The protein kinase domain of ER has all 11 conserved subdomains
of eukaryotic protein kinases and all invariant amino
acid residues in their proper positions (Hanks and Quinn, 1991).
This domain of ER is most closely related to the predicted
receptor-like protein kinases (RLKs) in higher plants: maize
ZmPK1 (36% identity; Walker and Zhang, 1990), Brassica
SRK6 (32% identity; Stein et al., 1991), and Arabidopsis TMK1
(35% identity; Chang et al., 1992) and RLK5 (40% identity;
Walker, 1993). ER appears to fall into the serine/threonine class
of protein kinases, because it contains diagnostic sequences
of this family (subdomains VIb and VIII) (Hanks and Quinn, 1991),
and SRK6, TMK1, and RLK5 are demonstrated to have serine/threonine
substrate specificity (Chang et al., 1992; Stein
and Nasrallah, 1993; Horn and Walker, 1994). A point mutation
in Ler changes isoleucine, a highly conserved amino acid




Figure 3. Morphology of Fully Expanded Siliques of er Mutants and Wild-Type Plants. 

(A) Wild-type Col. 
(B) er-101.

(C) er-102. 

(D) Ler. 

(E) Wild-type WS. 

(F) er-104. 

Siliques from plants homozygous for all er alleles are blunt and short compared with those of the wild-type plants. Bar in (A)

Figure 4. Structure of the ER Region and DNA Gel Blot Analysis.

(A) Shown are the restriction map and genomic structure of the ER
locus and its flanking region. Fragments used to screen yeast artificial
chromosomes and genomic P1 libraries are indicated by the open
bars, and fragments used to screen the cDNA library are indicated
by the hatched bars. Arrows indicate the orientation and lengths of
transcripts from 5' to 3'. The insertion site of T-DNA in er-104 and the
insertion site of DNA of unknown origin in er-105 are also indicated.
E, EcoRI; K, KpnI; S, SacI; Sc, ScaI; Xb, XbaI; Xh, XhoI.

(B) DNA gel blot analysis using plasmid-rescued fragments as a probe
revealed a polymorphism between wild-type WS (wt) and a T-DNA-tagged
allele, er-104. Probe R was used for hybridization. One microgram
of total genomic DNA was digested with XbaI (Xb) and XhoI (Xh).
T-DNA, originating from the transformation vector pGDW32 (Wing et
al., 1989), has one XbaI site but no XhoI site.

(C) DNA gel blot analysis with the ER cDNA as a probe revealed a
genomic rearrangement in a fast-neutron allele, er-105. One microgram
of total genomic DNA was digested with XbaI (Xb) and ScaI (Sc).
In (B) and (C), molecular length standards are indicated at left in
kilobases.



residue at subdomain VIa, into lysine (Figure 6C). Nine of 10
previously reported plant RLKs have isoleucine at this posi- 
tion (Walker, 1994). One exception, TMK1, has leucine, which 
is similar to isoleucine (Figure 6C) (Chang et al., 1992). There- 
fore, it is most likely that Ler disrupts proper activity of a 
functional receptor kinase. 
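
Percent identity values such as those quoted for the kinase domain comparisons are computed from pairwise alignments. The sketch below shows one common way to do this, assuming that identity is counted over aligned columns in which neither sequence has a gap. The source does not state which denominator was used, so this convention is an assumption, and the two aligned fragments are hypothetical toy strings rather than real kinase sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumption: the denominator is the number of gap-free aligned columns;
    other conventions (e.g., length of the shorter sequence) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
seq_a = "HRDIKSSNILLD"
seq_b = "HRDLKPSN-LLS"
print(round(percent_identity(seq_a, seq_b), 1))  # 72.7
```

Here 8 of the 11 gap-free columns are identical. Because the choice of denominator changes the result, identity values from different studies are comparable only when the same convention is used.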
Figure 5. Nucleotide and Deduced Amino Acid Sequence of the ER
Gene.

The positions of introns, determined by comparison of cDNA and 
genomic DNA sequences, are indicated by arrows. The putative sig- 
nal peptide and membrane-spanning regions are underlined. The 
possible N-glycosylation sites (Asn-X-Ser/Thr) are indicated by aster- 
isks. The open box indicates nucleotides that are missing in the genomic 
DNA of er-104 due to a T-DNA insertion. The first nucleotide of the
translation initiation site and the first amino acid of the ER protein are 
numbered 1. The GenBank/EMBL/DDBJ accession numbers for the 
wild-type ER cDNA and genomic DNA sequences are U47029 and
D83257, respectively.

Figure 6. Structure of the Predicted ER Protein and Conserved Features of ER Amino Acid Sequence.

(A) Structure of the ER gene and the predicted ER protein. The boxes correspond to the exons, and the thick lines correspond to the introns. 
The hatched boxes represent a signal sequence and a transmembrane region, the dotted boxes represent LRRs, the horizontally lined boxes 
represent the putative protein kinase domain, and the closed boxes represent untranslated regions. Numbers at the bottom indicate the starting 
and ending positions of amino acids in each domain. The amino acids mutated in two different alleles are also indicated. 

(B) The alignment of LRR repeats in the ER protein. Residues that appear at each position at >60% frequency are shown by black boxes. The 
positions of introns are indicated by an arrow. An amino acid mutated in er-103 is indicated by a boldface lowercase m. At the bottom is a compari- 
son of the LRR consensus sequence of ER with the consensus sequences of other LRR-containing proteins (Kobe and Deisenhofer, 1994). Asterisks 
designate nonconserved amino acids, and dashes designate gaps. 

(C) Homology alignments of the cytoplasmic kinase domain of ER with the kinase domains of ZmPK1 (Walker and Zhang, 1990), SRK6 (Stein
et al., 1991), TMK1 (Chang et al., 1992), and RLK5 (Walker, 1993), the four receptor-like protein kinases in plants. Highly conserved residues
identical to those in ER (more than three of four) are indicated by black boxes. The positions of 11 protein kinase subdomains (Hanks and Quinn, 
1991) are indicated by Roman numerals, and the invariant amino acids (Hanks and Quinn, 1991) are indicated by asterisks. Dashes designate 
gaps. The mutation observed in Ler is indicated above the altered residue, amino acid 750.

This result is consistent with the DNA gel blot analysis in which
those four alleles displayed the exact same pattern as those 
of wild-type Col and WS (data not shown). 

ER was expressed at the highest level in young floral buds, 
including inflorescence meristems (i.e., bud clusters), at high
levels in flowers and siliques, and at lower levels in stems, ro- 
sette leaves, and cauline leaves; no expression was observed 
in roots (Figure 7B). This expression pattern is consistent with 
the phenotype conferred by er. ER transcripts are more abun- 
dant in younger leaves than in mature rosette leaves (Figure 
7B), suggesting that ER may be involved in the cell expansion 
process during leaf formation. 




Figure 7. Expression of ER.

(A) Expression of ER transcripts of different mutant alleles. Total RNA
was isolated from inflorescences of 5-week-old wild-type (wt) plants
(Col and WS), Ler plants, and plants homozygous for each er allele.
Five micrograms of total RNA was loaded in each lane. The membrane
was probed with the ER cDNA that hybridized with a 3.3-kb band, and
then the same blot was reprobed with the 18S rDNA as a control.

(B) Expression of ER transcripts of different organs. Wild-type Col plants
(3 to 5 weeks old) were dissected before RNA extraction. "Young rosette"
designates the aerial parts of 2-week-old seedlings. Five micrograms
of total RNA was loaded in each lane. The blot was probed with the
ER cDNA and reprobed with the 18S rDNA as a control.



The ER Gene Is Most Highly Expressed at or around 
the Shoot Meristem 

The expression of ER transcripts was analyzed for different
er alleles and in the major plant organs by RNA gel blot analysis
(Figures 7A and 7B). As previously described, no detectable
transcript of any size was observed in er-104 and er-105
(Figure 7A). In er-104, T-DNA was inserted at the 5' untranslated
region by deleting 28 nucleotides, which possibly include a
transcriptional initiation point. Therefore, the gene may not be
transcribed. In er-105, ~4 kb of DNA of unknown origin was
inserted within the ER locus (Figures 4A and 4C); consequently,
the transcript may be unstable. A transcript of normal size was
detected in RNA gel blots from the other four er alleles,
including two point mutation alleles er-103 and Ler (Figure 7A).



DISCUSSION 

All of the five er alleles that we have isolated from Col and 
WS ecotypes showed phenotypes similar to that conferred by 
Ler, although a precise analysis showed the different degrees 
of severity (Figures 1 and 2). er plants have very short inflores- 
cence stems, siliques, and pedicels (Figure 1). However, it is 
unlikely that ER participates in the general process of cell elongation
for the following reasons. First, the er mutation affects
not only elongation of the organs. Regarding inflorescence de- 
velopment, the er mutation seems to alter the timing of stem 
elongation and flower bud formation; as a consequence, 
flowers and flower buds are clustered at the top of the in- 
florescence (Figure 2). Such an alteration in inflorescence 
architecture is not observed in cell elongation mutants, such 
as dwarf and auxin resistant2 (Bowman, 1993). Second, the 
defects of the er mutation are restricted to the above-ground 
portion of the plants and are most conspicuous in inflorescence 
stems and siliques. Third, ER transcripts are not uniformly ex- 
pressed throughout the plants. The expression is highest at 
or around the apical meristem and is absent in roots (Figure 
7B). These observations led us to presume that ER plays a
role in coordination of cell growth patterns within the organ 
primordia initiated from the shoot apical meristem. 

The predicted structure of ER supports the hypothesis that 
ER participates in the coordination of cell growth patterns. The 
ER gene encodes a putative transmembrane receptor protein 
kinase with an extracellular ligand binding domain (Figures 6A
to 6C). This strongly suggests that ER participates in intercel- 
lular signal transduction, perhaps through interaction with 
extracellular ligands that activate the intracellular kinase do- 
main. In plants, several transmembrane RLKs have been 
identified (reviewed in Walker, 1994). Based on the structural 
similarity of the extracellular domains, they are classified into 
three groups, namely, the S domain, epidermal growth fac- 
tor-like domain, and LRR domain (reviewed in Walker, 1994). 
None of these domains has been shown to function as a recep- 
tor, nor have their ligands been identified. The S domain was 
originally found in self-incompatibility locus glycoproteins 
(SLGs) in Brassica. The physical linkage of S domain RLKs
and SLGs in Brassica suggests that they function as a pair
in the self-incompatibility recognition between pollen and
stigma (Stein et al., 1991). TMK1 and RLK5 have been reported
as receptor-like serine/threonine protein kinases of Arabidopsis
with extracellular LRR domains, although their biological functions
are unknown (Chang et al., 1992; Walker, 1993).

LRR motifs are found in a variety of proteins with diverse 
functions and are thought to be involved in protein-protein in- 
teractions (Kobe and Deisenhofer, 1994). Extracellular LRRs 
are found in numerous transmembrane and membrane- 
attached proteins. Functionally, they are classified into two 
groups. One group contains several families of signal transduction
receptors. The mammalian luteinizing hormone/chorionic
gonadotropin (LH/CG) receptor and follicle-stimulating hor- 
mone (FSH) receptor participate in transmembrane signal 
transduction of peptide hormones (McFarland et al., 1989;
Heckert et al., 1992). A series of experiments, such as the trun- 
cation of the LH/CG receptor and domain swapping between 
the LH/CG receptor and the FSH receptor, has identified the 
extracellular LRR domain as a specific binding site for pep- 
tidic hormones (Braun et al., 1991). The human Trk tyrosine 
kinase receptor is encoded by a proto-oncogene that has both 
an extracellular LRR domain and a cytoplasmic kinase domain 
(Schneider and Schweiger, 1991). In plants, the tomato dis- 
ease resistance gene Cf-9 encodes a membrane-anchored 
glycoprotein with an extracellular LRR motif, which is most 
likely the receptor domain for the Avr9 peptide, the fungus aviru- 
lence gene product (Jones et al., 1994). 

The second group contains adhesive proteins. Human GPIb,
a receptor for the von Willebrand factor, mediates the adhesion
of platelets (Lopez et al., 1987, 1988). The role of the LRR
domain in adhesion is suggested because the polypeptide fragment
of GPIb containing LRR binds to the von Willebrand factor
(Handa et al., 1986; Titani et al., 1987). Drosophila Toll, chaoptin,
and connectin have extracellular LRRs, and their functions
are required to orient cells during development (Hashimoto
et al., 1988; Reinke et al., 1988; van Vactor et al., 1988; Nose
et al., 1992). The ectopic expression of Toll, chaoptin, and connectin
in Drosophila culture cells causes cell aggregation (Keith
and Gay, 1990; Krantz and Zipursky, 1990; Nose et al., 1992).
These experiments have demonstrated that they are indeed
cell adhesion molecules. Toll protein also has a signaling role
in the establishment of dorsal-ventral polarity in the Drosophila
embryo via the interaction with a soluble extracellular ligand,
a processed form of the Spätzle protein (Schneider et al., 1994).
Thus, there may be LRR proteins that have both signaling and
adhesive functions.

Considering the structural analogies of ER to mammalian
receptors and cell adhesion molecules, it is not unreasonable
to speculate that ER has a function similar to that of its animal
counterparts. The first possibility is that ER mediates a signal
from the apical meristem, perhaps by binding to a hypothetical
peptidic ligand, which is analogous to animal growth factors.
Alternatively, ER may directly promote cell membrane-cell wall
attachment and regulate cell shapes. The adhesion of animal
cells is mediated by the interaction of plasma membrane receptors,
collectively named integrins, and a family of extracellular
matrix glycoproteins that include the von Willebrand factor,
fibronectins, and vitronectins (Schindler et al., 1989). A previous
study using soybean cells suggested that an analogous system
is involved in cell membrane-cell wall attachment in plants
(Schindler et al., 1989). It is intriguing that ER shares a similar
LRR motif with a receptor for the von Willebrand factor (Lopez
et al., 1987, 1988).

Whatever the case may be, a cytoplasmic kinase domain 
in ER suggests the presence of a target molecule phos- 
phorylated by ER. Protein phosphatase (designated KAPP, a 
kinase-associated type 2C protein phosphatase) was identified 
by interaction cloning with RLK5. RLK5 and KAPP associate 
in a phosphorylation-dependent manner (Stone et al., 1994).
It is possible that ER also transduces signals via association 
with protein phosphatase, because ER shares high similarity 
with RLK5 in both the extracellular LRR and cytoplasmic ki- 
nase domain. The highest expression of ER transcripts in bud 
clusters (Figure 7B) suggests that the major site of interaction 
of ER with an unidentified ligand is at or around the shoot 
meristem, although we still do not know about the posttranscriptional
regulation of ER. This interaction may regulate
the subsequent development of organs, such as leaves de- 
rived from a vegetative meristem, stems derived from an 
inflorescence meristem, and siliques derived from a floral 
meristem. 

The gene structure of ER provides clues to the evolution 
of the LRR domain. In the sequences encoding all 20 LRRs 
of ER, introns are present at the exact same position, between 
the second and third nucleotides of the codon for leucine at position
13 (underlined) in the consensus P**LG*L**L**L*L**N*L*G*I
(asterisks represent nonconserved amino acids) (Figures 5 and
6B). Interestingly, eight of the 10 LRRs of the mammalian 
LH/CG and the FSH receptors and two of 10 LRRs of the sea
anemone G protein-coupled receptor are disrupted by the in- 
trons at the homologous position, again between the second 
and third nucleotides of the codon for the leucine at position 14 (underlined)
of their consensus P**AF**L**L**L*L**N*L**I (Heckert et al.,
1992; Tsai-Morris et al., 1992; Nothacker and Grimmelikhuijzen,
1993). The consistent and repeated location of introns in ER 
and in genes encoding animal receptors has implications for 
LRR evolution. First, the LRR motif may have evolved by exon
duplication. Second, a single LRR unit may be the ancestral 
form for both the animal and plant kingdoms. The presence 
of an intron in a single LRR-encoding sequence could pro- 
vide an opportunity to truncate the LRR domain, possibly by 
alternative splicing, to change the specificity for the particu- 
lar ligand. The alternative splicing has been observed in the 
rat LH/CG receptor and the sea anemone receptor (Koo et al.,
1991; Aatsinki et al., 1992; Nothacker and Grimmelikhuijzen,
1993). To date, there is no evidence for alternative splicing of 
ER transcripts. 

In this study, we have shown that ER encodes a putative 
transmembrane receptor kinase that regulates organ shape. 
Defining components of the signal transduction pathway 
mediated by ER is our next major challenge. Currently, we are
performing a genetic screen to isolate additional mutations
that show a genetic interaction with er. These genetic approaches
together with the complementary molecular approach, such 
as a search for molecules that interact with the LRR and the 
protein kinase domain of ER, may elucidate the nature of cell- 
cell communication controlling the development of plants. 



METHODS 



Isolation of erecta Alleles 

er-101 was isolated from x-ray-irradiated seed populations of Arabidopsis
thaliana ecotype Columbia (Col). er-102 and er-105 were isolated
from fast-neutron-irradiated Col seed populations (Lehle Seeds, Round
Rock, TX). er-103 was isolated from ethylmethane sulfonate-treated
Col seed mutagenized in our laboratory. er-104 was isolated from a
T-DNA-mutagenized population of ecotype Wassilewskija (WS) as follows.
T. Oosumi and R.F. Whittier established a collection of ~500
T-DNA-transformed Arabidopsis lines in the WS ecotype by Agrobacterium
tumefaciens-mediated plant transformation based on the method
of Chang et al. (1994). The transformation vector was pGDW32 and
is described by Wing et al. (1989). Screening was performed based
on the phenotype of inflorescence. Mutant alleles were backcrossed
twice into wild-type Col before the experiments described here. Genetic
analysis confirmed a single recessive nature of the mutations that are
allelic to Landsberg erecta (Ler) (data not shown). Plants were grown
under continuous light.



Scanning Electron Microscopy 

Samples were fixed in FAA (50% ethanol, 3.7% formaldehyde, and
5% acetic acid) overnight at 4°C and dehydrated through a graded
ethanol series. Samples were critical point dried with liquid CO2,
mounted, sputter coated with gold, and viewed in a scanning electron 
microscope (model HCP-2; Hitachi, Tokyo, Japan). 



Isolation of the ER Gene 

Plant DNA was isolated by using the cetyltrimethylammonium bromide
method (Watson and Thompson, 1986), and yeast DNA was isolated
as described by Ausubel et al. (1989). An Arabidopsis Col λYES cDNA
library (a gift of J. Mulligan and R. Davis, Department of Biochemistry,
Stanford University, Stanford, CA) was screened to isolate the ER
gene. Initially, 375,000 plaques were screened by using HindIII fragments
of the rescued plasmids (6Ea and 6Xc) as probes. Five clones
corresponding to the 0.6-kb transcripts and one partial clone (2.4 kb)
corresponding to the 3.3-kb transcripts were isolated. In vivo excision
of DNA was performed according to Elledge et al. (1991). Another
375,000 plaques were screened by using a partial ER clone (EcoRI
digests of pKUT100; insert size of 2.4 kb) as a probe. Five additional
clones were isolated, and one of them (3.2 kb) encoded a single open
reading frame. To isolate wild-type genomic clones, a P1 library (Liu
et al., 1995) was screened by using the EcoRI-XbaI fragments of the
rescued plasmids 6Ea and 6Xc as probes (designated L and R, respectively;
see Figure 4A). One clone (61H10) hybridized with both probes,
and another (28D7) hybridized only with the R probe. The 3.8-kb region
of overlap between these two clones included the 0.6-kb transcript
coding region as well as the 5' end of the 3.3-kb transcript coding region.

Isolated cDNAs and two XbaI fragments of P1 genomic clone 61H10
were subcloned into pBluescript II SK+ (Stratagene) for sequencing.
Sequencing was performed using a Taq Dye Primer Cycle sequencing
kit (Applied Biosystems, Foster City, CA) on an automated DNA
sequencer (model 373A; Applied Biosystems). Sequencing of cDNA
and genomic P1 clones revealed a nucleotide base substitution within
the ER coding region, which does not affect the amino acid sequence
(data not shown). This is likely due to the difference of the population
of plants used to construct the cDNA and genomic P1 libraries, even
though they are both designated as ecotype Col. For DNA gel blot
analysis, plant and yeast genomic DNAs were digested with appropriate
restriction enzymes, separated on 0.7% agarose gels, and
alkali-blotted onto Hybond N+ membrane (Amersham, Arlington
Heights, IL). DNA gel blot hybridizations under stringent conditions
were performed as recommended by the manufacturer (Amersham).
DNA probes were random prime-labeled with phosphorus-32 dCTP
by a BcaBEST labeling kit (TaKaRa, Kyoto, Japan).



Isolation of the er Gene from Ler and Sequencing of 
er Alleles 

The Arabidopsis Ler λZAPII cDNA library (a gift of K. Goto, Institute
of Chemical Research, Kyoto University, Kyoto, Japan) was used to
isolate the mutant er gene. Approximately 250,000 plaques were
screened, and two partial clones (1.9 and 1.7 kb) were isolated. Purified
phage were excised in vivo, according to the manufacturer's
instruction (Stratagene), and generated plasmids (pKUT170 and
pKUT180) were subcloned and sequenced. Alternatively, total RNA
was isolated from aerial parts of Ler and er-103, and cDNA was synthesized
using a first-strand cDNA synthesis kit (Amersham). First-strand
cDNA was used as a template for the polymerase chain reaction to
amplify the er cDNA for 40 cycles at 93°C for 1 min, 54°C for 2 min,
and 70°C for 1.5 min with Takara Taq (TaKaRa). Sequencing was performed
with the Taq DyeDeoxy Terminator Cycle sequencing kit (Applied
Biosystems).



RNA Gel Blot Analysis 

Total RNA was prepared by a general SDS-phenol method, with modifications
as follows. After LiCl precipitation, RNA samples were dissolved
in TE (10 mM Tris, 1 mM EDTA, pH 8.0) containing 0.3 M sodium acetate,
stored at 0°C for 25 min, and centrifuged. Isopropanol (0.5 volume)
was added to the supernatants, and total RNA was obtained as a precipitate.
These steps effectively eliminated polysaccharides, especially
in samples from roots and siliques. RNA was denatured by glyoxal;
electrophoresis, transfer of RNA to nylon membranes (GeneScreen
Plus; Du Pont), and hybridization under stringent conditions were performed
as recommended by the manufacturer. ER cDNA (EcoRI
fragment of pKUT160) was random prime-labeled with phosphorus-32
by a BcaBEST labeling kit and used as a probe. An insert from a clone
containing pea 18S rDNA sequences (Jorgenson et al., 1987) was used
as a control.